Subject to question


Even when conducting clinical trials to study widely used therapies,
researchers must ensure that they disclose the full risks to patients.

Full disclosure of the potential risks to people who volunteer to be
test subjects for biomedical research has been a bedrock of ethical
protections for decades. Now, a fresh question has come to the fore:
how best to protect human subjects in trials that examine the
effectiveness of existing therapies that are already in widespread use.

On 28 August, the US office charged with protecting human research
subjects will hold an unusual public meeting in Washington DC to tackle
this contentious, complex issue, which has polarized the biomedical
community in recent months. The Office for Human Research Protections
(OHRP), part of the Department of Health and Human Services, is asking
for input on how institutional ethics committees (the advisory boards
that decide whether proposed trials can go ahead) should assess the
risks to people in randomized studies that investigate the risks and
benefits of existing treatments for the same condition. Such ‘standard
of care’ trials are likely to become more widespread after being
mandated in the 2010 health-care law, so a lot is riding on what the
OHRP decides. It might insist that these risks be spelled out on
patient-consent forms, even though patients with a particular condition
would be taking one or the other medication anyway. Those who argue for
looser regulations of such research say that this move could put many
volunteers off, because they might mistakenly think that the research
itself is adding risk of harm.

The issue has been thrust into the spotlight by a protracted
controversy over a study of extremely premature infants, funded by the
US National Institutes of Health. From 2005 to 2009, the Surfactant,
Positive Pressure, and Oxygenation Randomized Trial (SUPPORT) enrolled
1,316 infants born, on average, 14 weeks early and weighing less than a
kilogram. Such infants struggle to breathe because of their immature
lungs and so are given extra oxygen from birth. Those in the trial were
assigned at random to one of two groups. In one, blood oxygen levels
were kept at the higher end of the range used in US hospitals, with the
attendant risk of causing an eye disorder called retinopathy of
prematurity (ROP), an abnormal growth of retinal blood vessels that
blinds 400-600 US infants every year. In the other group, oxygen levels
were kept at the lower end of the range, with the accompanying risks
including neurodevelopmental disorders and, some experts in the field
believed, death. The goal was to determine the effects of lower or
higher oxygen levels on the infants’ survival, neurological development
and likelihood of developing ROP. In short, the trial sought the sweet
spot: the level of oxygen supplementation that would lead to maximum
survival without damage.

RISK AVERSE 

In 2011 the OHRP, responding to a complaint, began to investigate
the informed consent forms signed by parents at the 23 SUPPORT
sites. In March this year, it concluded that the forms failed to describe
“the reasonably foreseeable risks of blindness, neurological damage
and death”. All but two of the forms failed to note, for instance, that
infants in the group maintained at higher oxygen levels would have a
greater chance of eye damage, yet more than half said that infants in
the lower-level group could benefit from a lower risk of eye disease
or less need for eye surgery. None noted the increased risks of
neurodevelopmental disorders in the lower-level group. None listed death
as a possible risk of the procedure, although the trial protocol (not
seen by parents) did list death among the related adverse events “that
may be related to the study”. The consent forms did reassure parents
that: “Because all of the treatments proposed in this study are within
standard of care, there is no predictable increase in risk for your
baby.”

Much of the biomedical establishment has rallied to support the trial
investigators and the ethics committees that approved the informed
consent forms. They argue that the babies encountered a set of grave
risks inherent to being premature, not to being randomly assigned to one
or the other arm of the trial. Because the trial administered treatment
within accepted guidelines endorsed by the American Academy of
Pediatrics, they say, the study added no risk and thus the consent forms
were adequate.

The goals of SUPPORT were laudable and addressed a need for 
better information for physicians. And the study did produce illu- 
minating findings: the infants who received lower levels (aiming to 
keep the oxygen saturation of their haemoglobin at 85-89%) were 
less likely to get severe eye disease — but more likely to die — than 
infants receiving oxygen at 91-95% saturation levels. But in an age in
which it is more important than ever that transparency and respect for
research subjects be beyond reproach, the SUPPORT consent forms simply
do not pass muster. And although it is true that, collectively, the
infants enrolled in the study may have been at no greater risk of a
negative outcome than infants who were not enrolled, it is not
collectives who sign informed consent documents. It is individuals.

Put yourself in the position of a parent with an extremely premature
infant. Would you make the decision to enrol your child in the trial if
the consent form stated in simple language that babies assigned to one
group were more likely to go blind, and that those in the other were at
a higher risk of getting neurodevelopmental disabilities? Equally, would
you decide to enrol if the form spelled out that, if you do not take
part, your own physician and institution might keep your infant in the
middle of the range, trying to avoid either outcome? Perhaps you might,
but you would do so with full knowledge of the attendant risks. The
parents in this case could not do so.

In June, under pressure from many sides, the OHRP said that it 
would not sanction the SUPPORT investigators and instead would 
hold next week’s meeting. No matter the thorniness of the issues raised 
there, research is still research in whatever context, and the duty to 
protect human subjects must remain paramount.

What is to be done about Russian science?

Government reforms to the Russian Academy of Sciences have met with
controversy, but some form of change is needed, argues Mikhail Gelfand.

The reforms to the Russian Academy of Sciences that were announced by
the Russian government in June were met with almost unanimous
opposition in the scientific community. Critics have complained that
the severity of the proposed changes (which include transferring
properties owned by the academy into the hands of the government) is
combined with a vagueness about how they will be implemented.
Furthermore, the abrupt announcement came with political pressure and a
smear campaign in state-owned media, but without public debate. The
government response is that all opinions have been stated already many
times (which is partially true), that the reform has only just started
and the detail will be clarified later, and that the only way to move
forward is, well, to move forward.

The reform bill is currently in the state Duma, where it will receive
its final reading next month. The signals from the Duma are
inconclusive: although some members, including the speaker, Sergei
Naryshkin, mentioned the possibility of returning the bill to the
second-reading stage, where substantial amendments are possible, there
have been no official statements along these lines.

The government has managed to achieve the seemingly impossible: it has
brought Russian science together. Academic stalwarts who oppose any
change (aside from an increase in the academy’s budget) have united
with proponents of (reasonable) reform, long-time critics of the
academy and scientists who normally run shy of politics. Despite summer
vacations, some members of the scientific community are discussing the
post-reform system, and others are planning meetings and strikes that
aim to overturn the proposals. A meeting of all groups working on
projects that relate to the reforms is scheduled for the end of August.

Some of the ideas being discussed seem more realistic than others.
With its head firmly in the sand, the presidium of the academy has
prepared a list of amendments to the bill that mainly aim at returning
to the pre-reform status quo. Another working group, formed by the
Scientific Council of the Ministry of Science (independent researchers
who are largely critical of the reforms) and the Society of Scientific
Researchers (an independent, informal society with free membership
that is restricted only by a publications-based qualification), has
offered other suggestions. These tackle fundamental issues such as
whether Russian science should be arranged around institutes or
laboratories, what the balance should be between guaranteed and
grant-based funding, and whether academy research should be subject to
international review. At their heart, these discussions debate whether
the future of the academy is as a learned society, similar to the UK
Royal Society, or as a Soviet-style ‘ministry of basic sciences’ that
manages and funds its institutes.


One burning problem acknowledged by most of those working on 
possible alternatives to the government reform is the future working 
relationships among the academy, its institutes and a new agency set 
up by the proposed law to handle academy property. The bill provides 
no details and is unclear about whether this property includes the land, 
buildings and equipment that are directly used for research purposes. 
In particular, scientists worry that all purchases will need to be approved 
by bureaucrats with no understanding of science. 

In the words of a famous Russian novel, what is to be done? A pre- 
requisite for successful reform is the creation of a transparent funding 
system that also features regular international assessment of laborato- 
ries, institutes and large projects. A more strategic goal should be to 
mend the split between research and higher education. This should 
not be done by simply increasing the financing of universities at the 
cost of research institutes, but rather by encouraging the educational 
activity of institute researchers and the research 
activity of university professors. Specific grant- 
based support of joint projects between institutes 
and universities is also needed. The teaching load 
of university professors, which is currently much 
higher than that of professors in the West, must 
be decreased, and regular audits should cover not 
only the academy but also other research centres. 
Finally, Russia’s leaders need to understand that
science cannot be expected to produce immediate
results in the form of ‘innovations’, but instead
needs to be judged on its own merits.

Ultimately, deep reform can be implemented 
only if the government has a popular mandate for 
change. This is not the case in Russia. Hence all reforms are met with 
distrust and a search for a hidden agenda. This distrust has been fuelled 
by a project to incorporate several physics institutes into the Kurchatov
Institute, the head of which, Mikhail Kovalchuk, is widely believed to be
a trusted adviser of President Vladimir Putin. The centre enjoys a steady,
rich flow of finance, despite a scientific output that is much weaker than
that of academic institutes. The Institute of Theoretical and Experimental
Physics, formerly one of the top scientific institutions, all but ceased
to function when it was incorporated into the Kurchatov Institute.

There is general agreement in Russia that change is overdue; even the 
new academy leadership acknowledges this. Forms of change separate 
from the unpopular government proposals could work. But it is unclear 
whether research in Russia can make the shift, given the current
political climate and the academy’s systemic, deeply rooted problems.


Mikhail Gelfand is a professor in the Faculty of Bioengineering and
Bioinformatics at Moscow State University, and deputy director of
the Institute for Information Transmission Problems of the Russian
Academy of Sciences.

e-mail: mikhail.gelfand@gmail.com 
RESEARCH HIGHLIGHTS

Selections from the scientific literature


ANIMAL BEHAVIOUR 


How termites 
drum up help 


To call for assistance when 
their nest is under attack, some 
termites use their heads. 
Wolfgang Kirchner and 
Felix Hager of Ruhr University 
in Bochum, Germany, 
mimicked predator attacks 
on two species of African 
termites that grow fungi in 
long underground galleries 
connected to their nests. 
Specialized soldier termites 
responded by drumming their 
heads against the ground, 
which drew more soldiers 
to the alarm. Laboratory 
experiments confirmed 
that soldiers sense the low- 
frequency vibrations. 
Vibrations from simulated 
drumming dissipated within 
40 centimetres, but many 
galleries are much longer. 
To transmit alarm calls over 
greater distances, the termites 
pass signals on to others until 
the messages reach soldiers, 
the researchers suggest. 
J. Exp. Biol. 216, 3249-3256 
(2013) 


Soil life predicts
nutrient flow


Studies of soil organisms are usually lab-based, but in a rare
field study, Franciska de Vries (now at Lancaster University, UK)
and her colleagues looked at the relationship between soil food
webs and carbon and nitrogen entering and leaving controlled areas.
The 60 sites, in Sweden, the United Kingdom, the Czech Republic and
Greece, included grassland, intensely farmed sites and areas with
crop rotation.

Intensive land use, such as wheat cultivation, reduced the mass of
soil life of all kinds. But the researchers found that biomass
within soil was a better predictor of nutrient cycling and soil
health than was land usage, and suggest that nutrient models should
pay more attention to what happens underground.
Proc. Natl Acad. Sci. USA http://dx.doi.org/10.1073/pnas.1305198110
(2013)


Global heat waves on the rise


Heat waves will become more common by 2040. Climate models used by
Dim Coumou of the Potsdam Institute for Climate Impact Research in
Germany and Alexander Robinson at the Complutense University of
Madrid predict that about 20% of Earth’s land surface will
experience monthly temperatures that are more than three standard
deviations from the mean. Such extremes occur over about 5% of the
global land surface today, and were seen in the 2012 heat wave
across the United States and in the Texan heat wave of 2011, when
reservoirs nearly dried up (pictured).

The heat-wave projections stand until 2040, no matter how much more
carbon dioxide humans put into the air. After that, lowered
emissions could allow temperatures to stabilize, whereas
maintaining current emissions would see the frequency of heat waves
continue to rise.
Environ. Res. Lett. 8, 034018 (2013)


CANCER IMAGING


Chemical reaction
reveals tumours


A chemical-imaging technique may one day allow tracking of prostate
cancer without the need for invasive biopsies.

A team led by Sarah Nelson at the University of California, San
Francisco, exploited differences in how healthy and cancer cells
break down certain chemicals, using them to spot tumours in 31
human patients.

The researchers used magnetic resonance imaging to observe
isotopically labelled pyruvate, a compound that supplies energy to
cells.

Shortly after injecting the labelled pyruvate into patients,
researchers could observe it being converted into lactate in
prostate tumours, and the conversion sometimes revealed cancer in
regions that had been overlooked by conventional imaging. Signals
that were more intense indicated faster metabolism of pyruvate, a
property that has been linked in animal studies to more-aggressive
forms of cancer.
Sci. Transl. Med. 5, 198ra108 (2013)


MATERIALS


Catalyst forms
under pressure


High pressure normally turns the porous minerals known as zeolites
into a powdery, non-crystalline mess. Chemists have now shown that
this is not always the case, by converting a zeolite into a stable
new mineral using high-pressure compression.

Zeolites are often used as catalysts because their pores can trap a
range of molecules. Depending on zeolite structure, the minerals
can break up heavy oil, separate out gases or purify water.

In the hunt for fresh zeolite structures, Avelino Corma at the
Polytechnic University of Valencia, Spain, and his co-workers used
diamond anvils to compress the minerals. At 32,000 times
atmospheric pressure, a type of silica zeolite transformed
irreversibly into another porous structure, which was better at
separating propene and propane than its parent form.
Angew. Chem. http://dx.doi.org/10.1002/anie.201305230 (2013)


STRUCTURAL BIOLOGY 


Lethal viral 
shape-shifter 


An ebolavirus protein adopts drastically different conformations
(pictured) throughout its life cycle, allowing the deadly virus to
do more with fewer genes.

Ebolaviruses kill up to 90% of the people they infect. Erica
Ollmann Saphire of the Scripps Research Institute in La Jolla,
California, and her colleagues used crystallography, biochemistry
and microscopy to track the structure of VP40, a protein that
controls ebolavirus assembly and exit from host cells. They learned
that the protein does not travel alone as previously thought, but
moves to the cell membrane in butterfly-shaped pairs, which then
align end-to-end into hexamers that form filaments essential for
viral assembly and release. The team also analysed requirements for
VP40 to form yet another structure, a previously observed ring that
binds to RNA and regulates viral genes in infected cells.
Cell 154, 763-774 (2013)


Mouth microbe 
causes cancer 


Certain bacteria living in the 
mouth and gut can invade 
intestinal cells and trigger 
changes that lead to colorectal 
cancer. 

A team led by Wendy 
Garrett at the Harvard School 
of Public Health in Boston, 
Massachusetts, found that 
the bacterium Fusobacterium 
nucleatum induced colonic 
tumours in genetically 
susceptible mice. 

Separately, Yiping Han 
at Case Western Reserve 
University in Cleveland, Ohio, 
and her colleagues showed that 
FadA, an adhesion molecule 
produced by F. nucleatum, 
interacts with a counterpart on 
mammalian cells and triggers 
proliferation of colorectal- 
cancer cells. Colon tissue from 
patients with tumours had 
100 times more copies of the 
gene encoding FadA than did 
tissue from healthy individuals. 
Cell Host Microbe 14, 195-206; 
207-215 (2013) 


PSYCHIATRIC GENETICS 


Common variants 
behind disorders 


The risk of getting a psychiatric illness is largely heritable, and
many of the genetic variants involved seem to be shared across
disorders.

The international Cross-Disorder Group of the Psychiatric Genomics
Consortium identified common genetic variants in more than 30,000
patients diagnosed with one of five psychiatric disorders, and
compared these with thousands of non-diagnosed controls. These
variants accounted for 17-29% of risk for the illnesses, and there
is substantial overlap between disorders. For example, in
schizophrenia, 15% of the variants overlapped with bipolar
disorder, 9% with depression and 3% with autism.
Nature Genet. http://dx.doi.org/10.1038/ng.2711 (2013)


COMMUNITY CHOICE
HIGHLY READ on cell.com


Commitment beats will


Avoiding temptation is more effective than resisting it.

Molly Crockett, now at University College London, and her
colleagues tested 78 men as they relied on willpower (resisting an
available temptation) or precommitment (voluntarily restricting
access to temptation) to obtain rewards.

After rating a set of erotic images, subjects could choose to view
a less-enjoyable image immediately or a more-enjoyable one after a
delay. In willpower tasks, the option to see the less-preferred
image was always available, whereas in precommitment tasks, men
chose at the outset whether to wait for a preferred image.
Participants were more likely to gain the superior reward in
precommitment scenarios, with the benefits of precommitment varying
across individuals. Imaging of a subset of 20 men revealed that
different brain areas were used for precommitment and willpower.
Neuron 79, 391-401 (2013)


Bone-eating 
worms in icy seas 


Two species of bone-devouring 
worms have been discovered 
in the cold waters of the 
Antarctic. Other members of 
this genus had previously been 
found only at warmer latitudes. 
Scientists led by Thomas 
Dahlgren at the company Uni 
Research in Bergen, Norway, 
found a new species of worm 
(Osedax antarcticus; pictured) 
carpeting whale bones that 
the team had placed on the sea 
floor. Another Osedax species
was found on bones left in
shallower water.


Pine and oak planks placed 
with the bones remained in 
near-pristine condition, free 
of the marine invertebrates 
that usually feed on wood in 
warmer waters and quickly 
consume sunken ships. As a 
result, the researchers suggest 
that shipwrecks on the cold 
sea floor will stay remarkably 
well-preserved. 

Proc. R. Soc. B 280, 20131390 
(2013) 

For a longer story on this research, 
see go.nature.com/kb2kix 


SEVEN DAYS
The news in brief


No money for eggs
California governor Jerry Brown vetoed a proposed law on 13 August
that would have allowed payments to women who donate their eggs for
scientific research, a move that may deter other states from
attempting to ease similar bans. The measure would have boosted the
availability of human eggs for research in fields such as cloning
or somatic-cell nuclear transfer. Separate rules prohibit the
California Institute for Regenerative Medicine in San Francisco,
the state's stem-cell agency, from funding research on stem-cell
lines created with eggs from paid donors. See go.nature.com/xkelfv
for more.


Nuclear waste
A US appeals court has ruled that the country’s Nuclear Regulatory
Commission must revive its review of the Department of Energy's
application to open a nuclear-waste repository at Yucca Mountain in
Nevada. The energy department had sought to withdraw its
application in March 2010, after years of political controversy and
concerns over whether the site would leak radioactive waste (see
Nature 473, 266-267; 2011). But on 13 August, the court ruled that
the regulatory commission must continue reviewing the energy
department’s application.


Hydroelectric halt
India’s Supreme Court has ordered that the construction of
additional hydroelectric dams must be suspended in the Himalayan
state of Uttarakhand. After flash floods and landslides in June
killed thousands of people in the state, environmental groups
blamed the recent rapid expansion of hydroelectric projects as a
contributing factor. On 13 August, the court ordered that an expert
group should be set up to study the environmental impact of dams,
tunnels and deforestation associated with hydroelectric plants, and
to assess whether such projects precipitated June's tragedy. See
go.nature.com/pjvsp4 for more.


Carnivore misidentified for decades
A nocturnal, tree-dwelling mammal with a bushy tail and
teddy-bear-like face is the first new carnivore species to be
identified in the Western Hemisphere in 35 years. Dubbed the
olinguito (Bassaricyon neblina, infant pictured), the
75-centimetre-long inhabitant of the Andean cloud forests had been
mistaken in museums and zoos for a close relative, the olingo, for
more than a century. Zoologists who reported the finding on
15 August (K. M. Helgen et al. ZooKeys 324, 1-83; 2013) first
discovered the mix-up by studying decades-old museum samples,
eventually confirming their findings by tracking a live olinguito
in Ecuador in 2006. DNA tests revealed that an olingo kept in US
zoos during the 1960s and 1970s was actually an olinguito. See
go.nature.com/acqawh for more.


Amazon drilling
Ecuador's President Rafael Correa has abandoned an initiative
intended to persuade wealthy nations to pay his country not to
drill for oil in an Amazon rainforest reserve. Correa had asked for
US$3.6 billion (50% of the estimated revenue from development) in
exchange for protection of the Yasuni National Park. After
receiving only $13 million in pledges since the initiative was
mooted in 2007, Correa announced on 15 August that the government
would move forward with drilling.


BUSINESS

Bribery probe
In a widening crackdown, China will launch a three-month bribery
probe across multiple business sectors, including the
pharmaceutical and medical-services industries, the country’s State
Administration for Industry and Commerce said on 14 August. Earlier
this month, the authorities began probing Sanofi, based in Paris,
over claims that the company offered 1.69 million renminbi
(US$274,000) in bribes to physicians in China. In July, the Chinese
government opened an investigation into senior executives of
GlaxoSmithKline in China for allegedly bribing officials and
physicians in the country to boost drug sales.


India drug patents
Following media reports, Swiss drug-maker Roche has confirmed to
Nature that it will stop enforcing an Indian patent that would have
protected its breast-cancer drug trastuzumab (Herceptin) until
2019. Makers of generic pharmaceuticals may now sell cheaper
versions of the drug, easing tensions over drug prices that have
seen India reject some patents and skirt others by granting local
‘compulsory licences’ (see Nature 500, 266; 2013). By avoiding
compulsory licensing, the compromise may help Roche to maintain
long-term access to the Indian market.


Planet-naming code
Long-promised guidelines for the public naming of planets and moons
were issued by the International Astronomical Union on 14 August.
The Paris-based organization, which oversees planetary
nomenclature, asks that any group gathering candidate names follow
the guidelines. Among other things, the rules discourage using the
names of pet animals and forbid the collection of money in the
naming process. They were prompted in part by the actions of
Uwingu, a space-education company in Boulder, Colorado, that in
February asked the public to pay to vote on candidate planet names
(see Nature 496, 407; 2013).


Space dust trails
NASA scientists have tracked a dust cloud that was dumped in
Earth’s stratosphere by a meteor explosion in February over
Chelyabinsk in Russia, the agency announced on 14 August. Using
satellite data and atmospheric models, researchers found that the
space dust, estimated to weigh hundreds of tonnes, reached an
altitude of 40 kilometres within hours of the blast. It then
swirled around the Northern Hemisphere for days, forming a band
(pictured) that lasted for at least 3 months. The findings will be
published in the journal Geophysical Research Letters.


Kepler kaput
NASA is abandoning attempts to resuscitate the crippled Kepler
Space Telescope, the agency announced on 15 August. Engineers have
spent months trying to repair two of the telescope’s four
gyroscope-like wheels, which are crucial for controlling its
movement. The first wheel failed in July 2012, with a second one
breaking in May (see go.nature.com/4w1ufr). The spacecraft needs at
least three working wheels to carry out its search for Earth-like
exoplanets that might support life. Although Kepler completed its
primary mission in 2012, it had begun an extended mission that was
scheduled to end in 2016.


Forensics clash
Senior forensic scientist Wang Xuemei has resigned as
vice-president of the Chinese Forensic Medicine Association, it was
reported on 18 August. In an online resignation video, she said
that she could no longer be associated with the academic
organization behind “ridiculous and irresponsible conclusions”,
referring to her doubts over the association's determination that a
man died accidentally in 2010 by falling on to electrified subway
tracks in Beijing. She added that she had resolved to quit the
forensic system in China. Wang is well known for criticizing last
year’s conviction of Gu Kailai for poisoning a British businessman,
although Wang's video did not refer to this. Gu is the wife of
ousted Chinese politician Bo Xilai, whose corruption trial begins
this week.


Research fraud
A researcher at Leiden University Medical Center in the Netherlands
has been fired for committing scientific fraud, the centre
announced on 14 August. Annemie Schuerwegh, who worked in the
rheumatology department, admitted manipulating data included in a
study published in Proceedings of the National Academy of Sciences
in 2010, says a report from the centre. She went into the
laboratory outside office hours and added mouse antibodies to tubes
of human blood samples. The centre will withdraw the article and
another paper, and has halted a clinical trial based in part on the
fraudulent data.


TREND WATCH
Numbers of cases of polio are falling in Nigeria, Pakistan and
Afghanistan, countries where wild poliovirus is endemic. But
outbreaks have occurred in Somalia and Kenya, part of a band of
African countries in which imported poliovirus tends to cause
periodic reinfections. In Somalia, emergency vaccinations could
prove particularly difficult. The medical charity Médecins Sans
Frontières said on 14 August that it would close all its programmes
there because of attacks on staff.

POLIO FLARES UP IN SOMALIA
A severe outbreak of polio in the Horn of Africa is worrying
public-health officials.
[Chart legend: 2012 to 13 August; 2012 total; 2013 to 13 August.
Source: www.polioeradication.org]


23 AUGUST
Hoping to bring the International Linear Collider to Japan, the
country’s high-energy-physics community holds a press conference to
announce the site it has chosen to host the proposed atom smasher.


28 AUGUST
The Office for Human Research Protections hosts a public meeting in
Washington DC to discuss guidelines on informed consent for
clinical trials that test the high and low ranges of standard
medical practice (see page 377).
go.nature.com/Ib42by

WATER MANAGEMENT


Forecasts turn tide on silt 


New York pioneers system to protect drinking water from adverse weather events. 


BY JEFF TOLLEFSON 


When Hurricane Irene hammered
the eastern United States in August
2011, floods sent a glut of silt into
New York City’s drinking-water system. The 
turbid waters rushed more than 100 kilome- 
tres through an aqueduct from the Catskill 
Mountains to the Kensico Reservoir, the last 
stop before the slurry would have reached 
millions of taps. For more than eight months 
afterwards, the city was forced to use an envi- 
ronmentally contentious chemical to rid the 
water of silt. 

With the frequency and intensity of such 
siltation events on the rise, New York City 
is about to embark on a pioneering upgrade 
to its water system. The focus will not be on 
new dams or silt traps. Instead, starting this 
November, New York’s reservoirs will be man- 
aged by souped-up software that automatically 
incorporates short-term weather forecasts and 
seasonal climate forecasts — helping water 
managers to deal with floods and droughts.

Hydrologists far beyond the Big Apple will 
be watching closely. New York’s programme 
depends on a streamflow-forecasting system
developed by the US National Weather Service, 
which aims to implement the system nation- 
wide in the coming years. “This project opens 
the door to a more quantitative use of seasonal 
climate forecasts, which will help people make 
better decisions,” says Andrew Wood, a hydrol- 
ogist at the National Center for Atmospheric 
Research in Boulder, Colorado. Australia is 
now experimenting with a similar forecasting 
system, and the European Commission Joint 
Research Centre has also rolled out a flood- 
prediction system. 

But New York will be one of the first cities 
to connect the forecasts to a water-management
system. Its immediate goal is to manage
storm runoff to meet water-quality standards
without adding costly new infrastructure to
the system, a network of 19 reservoirs and 3 
lakes that collectively hold more than 2 tril- 
lion litres of water. Under normal circum- 
stances, managers handle siltation by holding 
water in reservoirs and letting silt settle out. 
But several times since 2005, the city has had 
to let silty water flow all the way to Kensico, 
where it was then treated with aluminium 
sulphate — a chemical that causes silt parti- 
cles to coagulate and sink. State and federal 
regulators have raised concerns about build- 
up of aluminium sulphate in reservoir sedi- 
ments, and its potential effects on fish and 
aquatic organisms. 

The new system is designed to help the city 
to cope with major siltation episodes, which are 
expected to increase as the climate warms. With 
better warning of impending storms, water 
managers can drain certain reservoirs ahead 
of time and alleviate the potential for siltation. 
The software also takes seasonal forecasts into 
account, allowing managers to work out 
various conservation strategies if, for
example, drier weather is expected. 

“For years, we've done this by ourselves, 
just trying to balance all of this in our 
heads,” says Jim Porter, who heads water 
operations for the city’s department of envi- 
ronmental protection. “Hopefully, we can 
predict out a little further into the future.”

The potential savings are enormous. 
Coping with the silt problem by building 
a new intake system at one reservoir or 
increasing the size of a second reservoir 
would cost between US$200 million and 
$500 million. A new filtration plant could 
run to more than $10 billion. By contrast, 
the city’s analysis suggests that an integrated 
reservoir-management system can tackle 
the problem for roughly $8 million. 

But to make it happen, the city first needs 
better streamflow forecasts. Although the 
National Weather Service makes stream- 
flow predictions, it has until now done so 
mainly by comparing current conditions 
— precipitation, soil moisture, snowpack 
and streamflow — with historical aver- 
ages, and then extrapolating the results. 
This approach assumes that streamflows 
will evolve as they have in the past under 
similar circumstances, but does not look 
ahead to future conditions. 

The incoming system — years in the 
making — combines short-term and sea- 
sonal precipitation forecasts, and adds 
those predictions into the streamflow fore- 
casts. To validate the system, the National 
Weather Service checked its predictions 
against historical data. 

New York is paying the Weather Service 
about $1 million to accelerate the process 
so that the system will be available for use 
this year. “We now have something that is 
ready for prime time,” says John Schaake,
a hydrologist and independent consultant 
in Baltimore, Maryland, who helped to 
develop the streamflow-forecast system. 

The system will be available at 5 of the 12 
regional US river-forecast centres, although 
it is unclear when it will become standard 
nationally. There are budget constraints, 
and each centre will have to customize the 
system. “There's a lot of interest, but the 
question is how you institutionalize that,” 
says Kevin Werner, a hydrologist at the 
Colorado Basin River Forecast Center in 
Salt Lake City, Utah. 

And it is also unclear whether oth- 
ers will follow New York’s lead and hitch 
the forecasts to a reservoir-management 
system. Demonstrating that the forecasts 
improve water management should help to 
ease doubts, says Daniel Sheer, president of 
HydroLogics Incorporated in Columbia, 
Maryland, which is providing New York 
with the reservoir-management software. 
“There will be much broader interest in the 
forecasts if we can show that they work.”
Half of 2011 papers
now free to read 


Boost for advocates of open-access research articles. 


BY RICHARD VAN NOORDEN 


Search the Internet for any research article
published in 2011, and you have a 50-50 
chance of downloading it for free. This 
claim — made in a report produced for the
European Commission — suggests that many 
more research papers are openly available 
online than was previously thought. The find- 
ing, released on 21 August, is heartening news 
for advocates of open access. But some experts 
are raising their eyebrows at the high numbers. 

FREEDOM ONLINE
At least 43% of research papers published during 2008-11 are now free online, but the proportion varies by country and discipline.
[Chart legend: published in open-access journal]

There has been a steady move over the past
few years towards getting research papers that
are funded by government money into the
public domain, and the best estimates for the
proportion of papers free online run at around
30%. But these are underestimates, argues Eric
Archambault, the founder and president of 
Science-Metrix, a consultancy in Montreal, 
Canada, that conducted the analysis for the 
European Commission. 

The firm initially asked a team led by 
Stevan Harnad, an open-access campaigner 
and cognitive scientist at the University of 
Quebec in Montreal, to check a random sample 
of 20,000 papers published in 2008 (from the 
Scopus database of papers run by Elsevier). It 
used a program designed by Yassine Gargouri,
a computer scientist at the same university, to 
find free articles. The team found that 32% of 
the papers that it downloaded in December 
2012 were freely available. But when Archam- 
bault’s group checked 500 of these papers man- 
ually using Google and other search engines 
and repositories, the figure rose to 48%. 

On the basis of this initial test, Science- 
Metrix applied its own automated software, or 
‘harvester’, to 320,000 papers downloaded from
2004 to 2011; the tool searches publishers’ web- 
sites, institutional archives, repositories such as 
arXiv and PubMed Central, and sites such as 
the academic networking site ResearchGate 
and the search engine CiteSeerX.

It found that an average of 43% of articles 
published during 2008-11 are available online 
for free, with the results varying by country 
and discipline (see ‘Freedom online’). But the 
true figure is probably higher, because the har- 
vester does not pick up every free paper. When 
the incompleteness is adjusted for, the propor- 
tion of free articles from 2011 rises to about 
50%, says Archambault. 
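
The adjustment Archambault describes can be illustrated with a little arithmetic. The sketch below is not Science-Metrix's method, which this article does not spell out. It assumes that the harvester never flags a paywalled paper as free, so that the true share is simply the observed share divided by the harvester's recall (the fraction of genuinely free papers that it finds). The function names are hypothetical. The pilot recall uses the 32% and 48% figures quoted above, which came from a different program, and the 86% recall is an invented value chosen only to show how 43% could correct to about 50%.

```python
def implied_recall(auto_share: float, manual_share: float) -> float:
    """Fraction of truly free papers that the automated search found."""
    if not 0 < manual_share <= 1 or not 0 <= auto_share <= manual_share:
        raise ValueError("shares must satisfy 0 <= auto <= manual <= 1")
    return auto_share / manual_share


def adjust_for_recall(observed_share: float, recall: float) -> float:
    """Correct an observed free-paper share for papers the harvester missed.

    Assumes no false positives, so true_share = observed_share / recall.
    """
    if not 0 < recall <= 1:
        raise ValueError("recall must be in (0, 1]")
    return min(1.0, observed_share / recall)


# Pilot figures from the article: 32% found automatically, 48% by hand.
pilot_recall = implied_recall(0.32, 0.48)

# Hypothetical harvester: observes 43% free with an assumed 86% recall.
corrected = adjust_for_recall(0.43, 0.86)

print(f"pilot recall = {pilot_recall:.2f}, corrected share = {corrected:.2f}")
```

The example mixes the 2008-11 average with a correction that the report applied to 2011 alone, so it shows the shape of the calculation rather than reproducing the report's numbers.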

The report “confirms my optimism”, says
Peter Suber, director of the Office for Scholarly 
Communication at Harvard University in Cam- 
bridge, Massachusetts, and a proponent of open 
access to research. He thinks that it reflects the 
experiences of working scientists today. “When 
researchers hit a paywall online, they turn to 
Google to search for free copies — and, increasingly,
they are finding them,” he says.

The rise of open-access journals is part of the 
explanation: the share of papers published in 
these journals rose from 
4% in 2004 to 12% by 
2011, the report found 
— agreeing with figures 
published last year by 
Bo-Christer Bjork, who 
studies information systems at the Hanken
School of Economics in Helsinki. 

But the number of peer-reviewed manu- 
scripts made free by other means has also 
increased, the report says. That includes 
those eventually made free — often a year 
after publication, and sometimes on a 
temporary promotional basis — by pub- 
lishers that charge for subscription. But it 
also includes manuscripts that researchers 
themselves archive online on repositories 
and personal websites. Some of the articles, 
although free to read, may not meet formal 
definitions of open access because, for exam- 
ple, they do not include details on whether 
readers can freely reuse the material. 

The report does not try to distinguish 
between types of manuscript, nor where and 
how they were posted, says Archambault. 
“The situation is so complex that it’s very 
hard to measure.”

Bjork says that the latest measurements 
seem to have been carefully done, although 
he adds that because he does not have details 
of the robotic harvester’s code, he cannot 
evaluate its method. “Experts on the subject 
would probably agree that the open-access 
share of papers, measured around a year 
and a half after publication, is currently at 
least 30%,” he says. “Anything above that is
dependent on ways of measuring, with this
new study representing the highest estimate.”

The report, which was not peer reviewed, 
calls the 50% figure for 2011 a “tipping 
point”, a rhetorical flourish that Suber is not
sure is justified. “The real tipping point is 
not a number, but whether scientists make 
open access a habit,” he says. 

Harnad thinks that the next step should 
be to obtain more accurate measures of 
when papers become free. “It’s hardly a 
triumph if articles are only accessible after a 
one-year embargo,” he says. Greater measurement
accuracy is tricky to achieve, he
adds, because Google routinely blocks all 
robotic harvesters. He believes that research 
on the growth of open access should be 
given special concessions. 

The proportion of free online papers is 
likely to increase in the next few years. The 
European Commission says that, from 2014, 
the results of all research funded by the European
Union must be open access. And in
February, the US White House announced 
that government-funded research should be 
made free to read within 12 months of publi- 
cation (see Nature 494, 414-415; 2013). Fed- 
eral agencies are due to submit their plans 
for achieving this to the US Office of Science 
and Technology Policy by 22 August.
EVOLUTIONARY GENETICS


Soay sheep have greatest sexual fitness when they have two versions of a gene that determines horn size. 


Big horns clash with 
longevity in sheep 


Gene for small horns lowers sexual fitness but boosts lifespan. 


BY EWEN CALLAWAY 


Alpha Red 78 — a ram with horns like
tusks — sired 95 lambs before
he died at the ripe (for a ram) old age 
of nine. A gene with a role in horn growth 
explains his fertility and his longevity, finds a 
study of sheep on a remote Scottish isle. The 
work also explains how variation can persist in 
traits that offer big reproductive boosts. 
Ample horns are a ram’s ticket to reproduc- 
tive success. During the breeding season, males 
fight for access to females, and those with the 
largest horns win. But if big horns are a sexual 
asset, the genes underlying the trait should 
have become ubiquitous, says Susan John- 
ston, an evolutionary biologist at the Univer- 
sity of Edinburgh, UK, who led the research. 
Yet some male sheep have short horns or none 
at all. “From an evolutionary perspective, it 
doesn't really make sense,” Johnston says. 
Johnston's team turned to the sheep living 
on Hirta, an island 160 kilometres west of the 
Scottish mainland. The animals, a primitive 
breed called Soay (Ovis aries), are known for 
their diminutive size and their agility on cliffs. 
Two years ago, Johnston's group reported 
that a single gene, RXFP2, explains horn vari- 
ability in the sheep (S. E. Johnston et al. Mol. 
Ecol. 20, 2555-2566; 2011). One version of 
the gene, Ho⁺, is linked to large horns; another
allele, Hoᴾ, is associated with small ones.


In the latest study, published in Nature, 
Johnston's team related the RXFP2 genes 
of 1,750 sheep to three factors: horn size, 
reproductive success and lifespan (S. E. Johnston
et al. Nature http://dx.doi.org/10.1038/nature12489;
2013). Males with one or two
copies of the Ho⁺ allele had the biggest horns.
They fathered twice as many lambs as those
with two copies of the short-horned allele,
averaging 3 (versus 1.6) each year, says Johnston.
But where lifespan was concerned, rams
with two copies of Hoᴾ had an edge, she says,
with a 75% chance per year of surviving the 
harsh Hirta winter, compared with a 61% 
chance for those with two long-horned alleles. 

The scientists found that rams with one ver- 
sion of each allele (heterozygotes) had the best 
of everything: they were big-horned, fecund 
and long-lived. And this explains why short- 
horned rams persist. “I’m just impressed by 
the simple elegance of this story,” says Hopi
Hoekstra, an evolutionary geneticist at Har- 
vard University in Cambridge, Massachusetts. 
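
A rough calculation shows why rams carrying one copy of each allele can come out ahead. The sketch below is a toy model, not the analysis in Johnston's paper. It assumes that a ram sires a fixed number of lambs each breeding season and survives each winter independently with a fixed probability, so that his expected number of seasons is 1 / (1 - survival). The rates for the two homozygotes are those quoted above. The heterozygote's rates are an assumption (the big-horned reproductive rate combined with the small-horned survival rate), because the article gives no separate figures for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genotype:
    name: str
    lambs_per_year: float   # average lambs sired per breeding season
    winter_survival: float  # probability of surviving one winter

    def expected_seasons(self) -> float:
        # Geometric lifetime: 1 + s + s^2 + ... = 1 / (1 - s)
        return 1.0 / (1.0 - self.winter_survival)

    def expected_lifetime_lambs(self) -> float:
        return self.lambs_per_year * self.expected_seasons()


GENOTYPES = [
    Genotype("two large-horn alleles", 3.0, 0.61),
    Genotype("two small-horn alleles", 1.6, 0.75),
    # Assumed values: the article reports no numbers for heterozygotes.
    Genotype("one of each (assumed rates)", 3.0, 0.75),
]

for g in GENOTYPES:
    print(f"{g.name}: {g.expected_lifetime_lambs():.1f} lambs expected")
```

Under these assumptions the expected lifetime totals are about 7.7 lambs for rams with two large-horn alleles, 6.4 for rams with two small-horn alleles and 12.0 for heterozygotes. Neither homozygote can displace the other if the mixed genotype does best, which is the pattern the study reports.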

Johnston says that to learn more, scientists 
will need to study the gene: in humans and 
mice, it is involved in sexual development and 
bone density. She adds that heterozygotes such 
as Alpha Red 78 end up with more offspring 
largely because they outlive homozygous big- 
horned males, which tend to die young. 

The ram probably wasn't winning on his 
looks. “He was quite an ugly sheep,” she says.
Research minister Mihnea Costoiu and his predecessor Ecaterina Andronescu have left scientists unhappy.


Romanian science 
in free fall 


Researchers rue the reversal of positive reforms. 


BY ALISON ABBOTT 


After 11 years away from Romania,
biologist Ioan Ovidiu Sirbu thought carefully before returning
home to continue his scientific career. He
had been convinced that reforms to Romania’s 
cronyism-ridden research landscape were 
solid, particularly when, in 2011, government 
grants were for the first time ever allocated 
solely on the basis of performance. 


“With such a fair granting system, I was 
sure that I could do my research just as well in 
Romania as in Germany,” says Ovidiu Sirbu.
“But what happened was really disappointing.”

Just months after Ovidiu Sirbu established 
himself at the Victor Babes University of Med- 
icine and Pharmacy in Timisoara in 2012, a 
new government slashed research funding 
and unpicked the reforms, eliminating rules 
designed to establish a meritocracy. 

Ovidiu Sirbu’s disappointment is widely
shared. In April, hundreds of scientists took to
the streets in protest, and more than 900 signed 
a petition addressed to Prime Minister Victor 
Ponta, demanding that the research budget and 
quality control be restored. The entire National 
Research Council, Romania’s main research- 
funding agency, resigned in protest (see Nature 
496, 274-275; 2013). 

With no compromise from the government 
and the council seats still unfilled, Romanian 
science is adrift. Scientists are resigned to tread- 
ing water, in the hope that the tide will turn. 

Many of Romania’s best researchers left during
the political chaos that followed the collapse
of communism in 1989. But in 2011, the government
passed a law designed to drive up standards
in education and science. Research and
education minister Daniel Funeriu furnished 
the law with rules and regulations crafted to 
break through local power networks and ensure 
that funding and academic positions would go 
to the best people — for example by requiring 
grant applications to be reviewed by foreign 
experts, and by instituting minimum qualifications
for job candidates (see Nature http://doi.org/bp7nsg;
2011). At the same time, the
research budget was boosted by nearly half. 

But that government fell last year. Reversals 
to the reforms followed; many scientists blame 
Funeriu’s successor, Ecaterina Andronescu. 

Andronescu, who took the post last July, is 
a powerful figure in Romanian academic poli- 
tics. She was research and education minister 
in two previous governments, and was rector 
at the Polytechnic University of Bucharest until 
she stepped down under Funeriu’s conflict-of- 
interest rules. The law forbade rectors from 
being politicians, in a bid to stop academics 
using political positions to help cronies. 

Andronescu lost her ministerial post in fresh
elections last December, but during her last days
in office, she overhauled Funeriu’s regulations
using three legal tools — including an ‘emergency
ordinance’, or decree, whose rationale, she
declared, was a need to make standards attainable
to more people. This decree is currently
being discussed in parliament. Andronescu’s
actions reversed safeguards against conflicts
of interest and cronyism. Andronescu did not
respond with comment on these specific concerns
in time for Nature’s press deadline.


MISCONDUCT ALLEGATIONS

Plagiarism in politics

Ecaterina Andronescu, Romania’s former
research and education minister, is a
co-author of three out of four papers that
international experts claim were plagiarized
and infringe copyright. The allegations — the
latest in a series of plagiarism scandals to rock
Romania (see Nature http://doi.org/ngq;
2012) — were published last December on
the scientist-run website Integru.org, which
was set up with the aim of purging plagiarism
from Romanian academia. The editors of
the journals — Elsevier’s Thin Solid Films and
Journal of the European Ceramic Society, the
IEEE International Semiconductor Conference
and the Romanian Academy’s Romanian
Journal of Information Science and Technology
— were told a few days later.

The papers, published in 2006-07, have
still not been retracted, says David Tomanek, a
physicist at Michigan State University in East
Lansing, who reads Integru. He told Nature
that the journals had not replied to requests
for information from concerned scientists.

Only one editor — of Thin Solid Films —
responded to Nature’s own requests for
information. Joe Greene, a physicist at the
University of Illinois at Urbana-Champaign,
wrote in an e-mail on 28 May that Elsevier
had applied its own investigation procedures.
He expected the inquiry to be concluded
“within weeks to a couple of months”.
Andronescu, who has not responded to
Nature’s written and telephone requests for
comment on this particular issue, denied
plagiarism in an interview published in a
Romanian newspaper in May, adding that
a representative from the Journal of the
European Ceramic Society had called her just
days earlier to congratulate her on a paper. A.A.

Under the latest rules, university rectors can 
once more be members of parliament, and aca- 
demics over the retirement age of 65, including 
Andronescu herself, can hold leadership posi- 
tions at universities — previously banned to 
stop people holding on to power for too long. 
Funeriu had limited academics to supervising 
eight PhD students at a time — to stop powerful 
professors from dominating the training of the 
next generation — but that restriction has now 
been lifted. And grant applications no longer 
require review by scientists outside Romania. 

In addition, Andronescu abolished the 
requirement that professors pass a special 
exam, and loosened Funeriu’s minimum cri- 
teria for holding an academic post. Critics slam 
the new criteria as too soft, and say that they 
are distorted in some subjects — in biology, for 
example, the focus is on publication of books 
rather than of peer-reviewed papers. The loosened
criteria were applied this year in appointing
1,300 professors as part of Romania’s first
competition for academic posts since 2009. 

Andronescu, who remains a member of par- 
liament, is now head of the senate’s education 
committee and leads her university's senate. She 
told Nature that responsibility for developing 
minimum criteria for academic appointments 
lies with the Romanian National Council for 
the Attestation of University Titles. The criteria 
then become official through ministerial order. 

Late last year, Andronescu was embroiled 
in further controversy, owing to accusations of 
plagiarism and copyright infringement in her 
research papers (see ‘Plagiarism in politics’). 

Her ministerial successor, Mihnea Costoiu, 
told Nature that all procedures for academic 
appointments had been correctly followed in 
this year’s hiring surge. He added that asser- 
tions that standards for becoming a professor 
had been lowered were “either a misinterpreta- 
tion or an uninformed assumption on the part 
of the ‘initiators’ of this theory”. 

In April, Costoiu made deep retrospective 
cuts to the basic research budget, roughly halv- 
ing the value of grants awarded in 2011 that 
were already in progress, and stalling the next 
round of already-evaluated grants. He also 
launched a new grant competition, for collabo- 
rations between research and industry, using 
laxer rules. Costoiu says that he intends ongoing 
grants to receive their full monies in later years. 

In spite of the tumult, Ovidiu Sirbu remains 
optimistic. A grant that he applied for in 2012 
finally came through this month, although it 
had been cut by about one-quarter. And he 
thinks that by staying, he can make a small 
difference to science in Romania. “One of the 
good things is that I can train people the way 
they should be trained,” he says. “That’s the 
best I can do now in this country.”
Apple scab is one of several crop diseases that researchers want to beat with genetic engineering.


BIOTECHNOLOGY 


US regulation misses 
some GM crops 


Gaps in oversight of transgenic technologies allow 
scientists to test the waters for speciality varieties. 


BY HEIDI LEDFORD 


It took scientists 85 years to breed a
commercial apple that could fend off apple
scab, a devastating disease caused by the
fungus Venturia inaequalis. In 1999, they
finally produced a tasty variety that contained
the Vf defence gene, bred in from an unappetizing
relative. Instead of dousing orchards
with fungicides 30 times a season, farmers
could spray the resistant crop just twice.

But five years later, V. inaequalis had
evolved and apple trees were becoming
infected again. Breeders were back to square
one. Even armed with modern breeding techniques
and 15 known defence genes in the
apple family, it would take another 40 years
to breed a resistant strain conventionally,
says Henk Schouten, a plant scientist at
Wageningen University in the Netherlands.

So instead, Schouten has joined a small
but growing pool of academics and companies
hoping to take advantage of the latest
approaches in genetic engineering, while
avoiding the lengthy and expensive burden
of government regulation. Because he wants
to insert DNA only from related apple varieties,
Schouten argues that his product should
not be regulated in the same way as genetically
modified (GM) crops that are engineered
with bacterial or viral DNA. Other pioneers 
argue that the techniques they are using to 
modify plants are safer than old technologies, 
and therefore do not need regulation. In some 
cases, US regulators have agreed. 

Since 2010, the US Department of Agricul- 
ture has told at least 10 groups that their GM 
products would not require regulation (see 
‘Cropping out regulation’) — removing a 
substantial financial barrier and speeding up 
development. That has encouraged academic 
labs and small companies to pursue special- 
ity crops, such as apples, that have so far been 
ignored by biotechnology giants. 

“There are any number of companies 
exploring new techniques to produce crops 
that don't trigger regulatory oversight,” says 
Scott Thenell, managing director of Thenell & 
Associates, a consulting firm in Walnut Creek, 
California, that helps researchers to navigate 
GM-plant regulations. “And often, they are 
small or niche crops that can’t support the 
escalating costs of regulatory approval.”

The regulation of GM crops in the United
States is based on laws […]

[The Animal and Plant Health Inspection Service (APHIS),]
the branch of the agriculture department
responsible for overseeing GM crops, has so 
far stuck to a strict interpretation of a 1957 law 
designed to protect agriculture against plant 
pests that was co-opted in 1986 to regulate 
GM crops. At that time, GM crops were nearly 
always engineered using Agrobacterium tume- 
faciens, a bacterial pest that can insert DNA into 
plant genomes. 

In 2011, APHIS regulators announced 
that a herbicide-tolerant Kentucky bluegrass 
would not fall under their purview, because 
the lawn-and-garden company developing it 
did not use Agrobacterium or any other plant- 
pest DNA to engineer the grass. The company, 
Scotts Miracle-Gro of Marysville, Ohio, 
instead used a gene gun to fire DNA-coated 
gold particles into plant cells. Some of that 
DNA is then incorporated into the genome. 

For Greg Jaffe, director of biotechnology at 
the Center for Science in the Public Interest, a 
consumer advocacy group in Washington DC, 
the news highlighted the shortcomings of the 
US regulatory system for GM crops. “The whole 
system is a fiction,” he says. 

And some are starting to test the regulation- 
free waters. Scotts Miracle-Gro, for its part, 
has said that its bluegrass was not meant to be 
commercialized, and was just a test case to see 
how APHIS would respond. That is not the case 
for other groups that have been told that their 
GM products would not be regulated. Some 
include academic researchers, who are eager 
to avoid field-trial permits and special contain- 
ment measures, and who want to encourage 
corporate development of niche crops. 

Dennis Gray, a developmental biologist at 
the University of Florida in Apopka, is trying 
to use genes from grape varieties to engineer a 
wine grape that is resistant to Pierce’s disease 
—a condition caused by a bacterium that has 
made it difficult to grow wine grapes in the state. 
He says that the lack of regulation is encourag- 
ing researchers like him to pursue such small- 
market crops. “Little agricultural labs just don't 
have access to the infrastructure and the money 
needed to move these forward.” 

CROPPING OUT REGULATION

Since 2010, the US Department of Agriculture has told at least 10 groups that their genetically modified
(GM) crops would not be regulated because a plant pest was not used to do the engineering.

Other emerging approaches may also escape
regulation. Sally Mackenzie, a plant biologist at
the University of Nebraska-Lincoln, contacted
APHIS about the high-yield offspring of a transgenic
sorghum grass plant — even though these
offspring no longer contain the engineered gene.
Mackenzie thinks that the transgene triggered
an epigenetic change: it altered the plant's gene
expression by changing the pattern of chemical 
groups added to its DNA rather than chang- 
ing the DNA sequence itself. In 2012, APHIS 
regulators invited Mackenzie to the organiza- 
tion’s headquarters in Riverdale, Maryland, and 
questioned her about this hypothesis. APHIS 
eventually notified her that it would not regulate 
her plants — a decision that Mackenzie says has 
accelerated her research and may allow her to 
launch a company to develop her grass variety. 

Agricultural giants Monsanto, based in 
St Louis, Missouri, and Syngenta, headquar- 
tered in Basel, Switzerland, are vying to license 
the technology. “The first thing they asked me 
was, ‘Have you been through APHIS?’” says
Mackenzie. 

Other companies are gauging their prospects 
with different DNA-modification tools, such as 
zinc-finger nucleases — enzymes that precisely 
target a region of the plant genome. In 2010, 
APHIS told Dow AgroSciences of Indianapolis, 
Indiana, that it would not regulate a herbicide- 
tolerant maize (corn) made using zinc-finger 
nucleases. Dow spokesman Garry Hamlin says 
that the company has since dropped the maize 
project, but is working with outside researchers 
to develop other crops using similar technology. 

Jennifer Kuzma, a policy analyst at North 
Carolina State University in Raleigh, says that a 
lack of regulation for the latest approaches could 
fuel public suspicions about GM crops. “One
could argue that the technologies are more targeted
and you're doing things in a smarter way,”
she says. “The flip side is that they are so power- 
ful you can engineer multiple genes at one time.” 

Not all companies are embracing the poten- 
tial for freedom from regulation. Oliver Peoples, 
chief scientific officer at Metabolix, a plant-engi- 
neering company in Cambridge, Massachusetts, 
says that he would rather be regulated by APHIS 
to earn the public's trust. He notes that Agrobac- 
terium inserts genes more efficiently than the 
gene-gun method. Although zinc-fingers are 
appealing for their specificity and their ability 
to escape regulation, companies do not yet have 
much experience in working with the technique, 
or navigating the patents needed to use it. 

Schouten, meanwhile, did not skirt regulation 
for his apples after all. In April 2012, APHIS told 
him that the agency would regulate his variety 
in spite of the fact that the genes he introduced 
came from other apples. This was because he 
used Agrobacterium to insert the genes — it did 
not matter to regulators that no trace of Agro- 
bacterium DNA remained in his plants. 

Schouten is perplexed. If he had used a gene
gun, he would have inserted DNA haphaz- 
ardly and in a manner more likely to damage 
other sites in the genome — yet this remains 
the unregulated method. “To me, this is a very 
strange system,” he says.
EGG ENGINEERS


In a technical tour de force, Japanese researchers created
eggs and sperm in the laboratory. Now, scientists have to
determine how to use those cells safely — and ethically. 


BY DAVID CYRANOSKI 


Since last October, molecular biologist
Katsuhiko Hayashi has received 
around a dozen e-mails from cou- 
ples, most of them middle-aged, who 
are desperate for one thing: a baby. 
One menopausal woman from Eng- 
land offered to come to his labora- 
tory at Kyoto University in Japan in the hope 
that he could help her to conceive a child. “That 
is my only wish,” she wrote.

The requests started trickling in after 
Hayashi published the results of an experi- 
ment that he had assumed would be of interest 
mostly to developmental biologists. Starting
with the skin cells of mice in vitro, he created 
primordial germ cells (PGCs), which can 
develop into both sperm and eggs. To prove 
that these laboratory-grown versions were 
truly similar to naturally occurring PGCs, he 
used them to create eggs, then used those eggs 
to create live mice. He calls the live births a 
mere ‘side effect’ of the research, but that bench 
experiment became much more, because it 
raised the prospect of creating fertilizable eggs 
from the skin cells of infertile women. And it 
also suggested that men’s skin cells could be 
used to create eggs, and that sperm could be 
generated from women's cells. (Indeed, after 
the research was published, the editor of a gay 
and lesbian magazine e-mailed Hayashi for 
more information.) 

Despite the innovative nature of the research, 
the public attention surprised Hayashi and his 
senior professor, Mitinori Saitou. They have 
spent more than a decade piecing together the 
subtle details of mammalian gamete produc- 
tion and then recreating that process in vitro 
— all for the sake of science, not medicine. 


Their method now allows researchers to create 
unlimited PGCs, which were previously dif- 
ficult to obtain, and this regular supply of 
treasured cells has helped to drive the study 
of mammalian reproduction. But as they push 
forward with the scientifically challenging tran- 
sition from mice to monkeys and humans, they 
are setting the course for the future of infer- 
tility treatments — and perhaps even bolder 
experiments in reproduction. Scientists and 
the public are just starting to grapple with the 
associated ethical issues. 

“It goes without saying that [they] really 
transformed the field in the mouse,” says 
Amander Clark, a fertility expert at the Uni- 
versity of California, Los Angeles. “Now, to 
avoid derailing the technology before it’s had 
a chance to demonstrate its usefulness, we 
have to have conversations about the ethics of 
making gametes this way.”


BACK TO THE BEGINNING 
In the mouse, germ cells emerge just after the 
first week of embryonic development, as a group 
of around 40 PGCs. This little cluster goes on to
form the tens of thousands of eggs that female 
mice have at birth, and the millions of sperm 
cells that males produce every day, and it will 
pass on the mouse’s entire genetic heritage. Saitou
wanted to understand what signals direct
these cells throughout their development. 
Over the past decade, he has laboriously 
identified several genes — including Stella, 
Blimp1 and Prdm14 — that, when expressed 
in certain combinations
select PGCs from among other cells and study
what happens to them. In 2009, from experi- 
ments at the RIKEN Center for Developmental 
Biology in Kobe, Japan, he found that when 
culture conditions are right, adding a single 
ingredient — bone morphogenetic protein 4 
(Bmp4) — with precise timing is enough to 
convert embryonic cells to PGCs. To test this
principle, he added high concentrations of
Bmp4 to embryonic cells. Almost all of them
turned into PGCs. He and other scientists had
expected the process to be more complicated. 
Saitou’s approach — meticulously following 
the natural process — was in stark contrast to 
work that others were doing, says Jacob Hanna, 
a stem-cell expert at the Weizmann Institute of 
Science in Rehovot, Israel. Many scientists try 
to create specific cell types in vitro by bombard- 
ing stem cells with signalling molecules and 
then picking through the resulting mixture 
of mature cells for the ones they want. But it 
is never clear by what process these cells are 
formed or how similar they are to the natural 
versions. Saitou’s efforts to find out precisely 
what is needed to make germ cells, to get rid of 
superfluous signals and to note the exact timing 
of various molecules at work, impressed his col- 
leagues. “There’s a really beautiful hidden mes- 
sage in this work — that differentiation of cells 
[in vitro] is really not easy,” says Hanna. Harry
Moore, a stem-cell biologist at the University of 
Sheffield, UK, regards the careful recapitulation 
of germ-cell development as “a triumph”. 
Until 2009, Saitou’s starting point had been 
cells taken from a live mouse epiblast — a cup- 
like collection of cells lining one end of the 
embryo that forms at the end of the first week 
of development, just before the PGCs emerge. 
But to truly master the process, Saitou wanted
to start with readily available, cultured cells. 


ILLUSTRATION BY VIKTOR KOEN 


That was a project for Hayashi, who in 2009 
had returned to Japan from the University of 
Cambridge, UK, where, like Saitou before him, 
he had completed a four-year stint in the labo- 
ratory of a pioneer in the field, Azim Surani. 
Surani speaks highly of the two scientists, say- 
ing that they “complement each other in tem- 
perament and in their style and approach to 
solving problems”. Saitou is “systematic” and 
“single-minded about setting and accomplish- 
ing his objectives”, whereas Hayashi “works 
more intuitively, and takes a broader view of 
the subject and has outwardly a more relaxed 
approach”, he says. “Together they form a very
strong team indeed.” 

Hayashi joined Saitou at Kyoto University, 
which he quickly found was different from 
Cambridge. There was much less time spent on 
theoretical discussions than Hayashi was used 
to; instead, one jumped into experiments. “In 
Japan we just do it. Sometimes that can be very 
inefficient, but sometimes it makes a huge suc- 
cess,” he says. 

Hayashi tried to use epiblast cells — Saitou’s 
starting point — but instead of using extracted 
cells as Saitou did, he tried to culture them as a 
stable cell line that could produce PGCs. That 
did not work. Hayashi then drew on other 
research showing that one key regulatory molecule
(activin A) and a growth factor (basic fibroblast
growth factor) could convert cultured early
embryonic stem cells into cells akin to epiblasts. 
That sparked the idea of using these two factors 
to induce embryonic stem cells to differentiate 
into epiblasts, and then to apply Saitou’s previ- 
ous formula to push these cells to become PGCs. 
The approach was successful.

To prove that these artificial PGCs were 
faithful copies, however, they had to be shown 
to develop into viable sperm and eggs. The
process by which this happens is complicated
and ill understood, so the team left the job to 
nature — Hayashi inserted the PGCs into the 
testes of mice that were incapable of producing 
their own sperm, and waited to see whether 
the cells would develop. Saitou thought that it
would work, but fretted. “It seemed like a 50/50
chance,” he says. “We were excited and worried
at the same time.” But, on the third or fourth 
mouse, they found testes with thick, dark semi- 
niferous tubules, stuffed with sperm. “It hap- 
pened so properly. I knew they would generate 
pups,” says Hayashi. The team injected these 
sperm into eggs and inserted the embryos into 
female mice. The result was fertile males and 
females (see ‘Making babies’).

They repeated the experiment with induced 
pluripotent stem (iPS) cells — mature cells that 
have been reprogrammed to an embryo-like 
state. Again, the sperm were used to produce 
pups, proving that they were functional — a 
rare accomplishment in the field of stem-cell 
differentiation, where scientists often argue over 
whether the cells that they create are truly what 
they seem to be. “This is one of the few exam- 
ples in the entire field of pluripotent-stem-cell 
research where a fully functional cell type has 
been unequivocally generated starting from a 
pluripotent stem cell in a dish,” says Clark.

They expected eggs to be more complex, but
last year, Hayashi made PGCs in vitro with cells
from a mouse with normal colouring and then 
transferred them into the ovaries of an albino 
mouse. The resulting eggs were fertilized
in vitro and implanted into a surrogate. “I knew 
it had worked,” he says, when he saw the pups’ 
dark eyes pressing through their translucent 
eyelids. 


GERM-CELL BOUNTY 
Other researchers have been able to replicate 
the process to generate laboratory-grown PGCs 
(although none contacted by Nature had used 
them to produce live animals). Artificial PGCs 
are of particular use to scientists who study 
epigenetics: the biochemical modifications to 
DNA that determine which genes are expressed. 
These modifications — most often the addi- 
tion of methyl groups to individual DNA bases 
— in some instances carry a sort of historical 
record of what an organism has experienced 
(for example, exposure to foreign chemicals in 
the womb). In a similar way to how they work
in other cells, epigenetic markers push PGCs to 
their fate during embryonic development, but 
PGCs are unique because when they develop 
into sperm and eggs, the epigenetic markers 
are erased. This allows the cells to create a new 
zygote that is capable of forming all cell types. 

Faults in subtle epigenetic changes are 
expected to contribute to infertility and the 
emergence of disorders such as testicular can- 
cer. Already, Surani’s and Hanna’s groups have 
used the artificial PGCs to investigate the role 
of individual enzymes in epigenetic regulation, 
which may one day show how the epigenetic 
networks are involved in disease. 

Indeed, the in vitro-generated PGCs offer
millions of cells for scientists to study, instead
of the 40 or so that can be obtained by dissecting
early embryos, says Hanna. “This is a big deal
because here we have these rare cells — PGCs —
that are undergoing dramatic genome-wide epigenetic
changes that we barely understand,” he
says. “The in vitro model has provided unprecedented
accessibility to scientists,” agrees Clark.

MAKING BABIES
Pluripotent stem cells are extracted from early
embryos or induced from somatic cells. These cells
are then converted to germ-cell precursors using
key growth factors and other signalling molecules.
[Figure labels: skin cells; induced pluripotent stem cells; epiblast-like cells]


CLINICAL RELEVANCE 

But Hayashi and Saitou have little to offer to 
the infertile couples begging for their help. 
Before this protocol can be used in the clinic, 
there are large wrinkles to be ironed out. 

Saitou and Hayashi have found that although 
the offspring generated by their technique usu- 
ally seem to be healthy and fertile, the PGCs that 
these offspring generate in turn are not completely
‘normal’. The second-generation PGCs
often produce eggs that are fragile, misshapen 
and sometimes dislodged from the complex 
of cells that supports them. When fertilized,
the eggs often divide into cells with three sets 
of chromosomes rather than the normal two, 
and the rate at which the artificial PGCs suc- 
cessfully produce offspring is only one-third of 
the rate for normal in vitro fertilization (IVF). 
Yi Zhang, who studies epigenetics at Harvard 
Medical School in Boston, Massachusetts, and 
who has been using Saitou’s method, has also 
found that in vitro PGCs do not erase their pre- 
vious epigenetic programming as well as natu- 
rally occurring PGCs. “We have to be aware that 
these are PGC-like cells and not PGCs,” he says.

In addition, two major technical challenges 
remain. The first is working out how to make 
the PGCs convert to mature sperm and eggs 
without transplanting them back into testes 
or ovaries; Hayashi is trying to decipher the 
signals that ovaries and testes give to the PGCs 
that tell them to become eggs or sperm, which 
he could then add to artificial PGCs in culture 
to lead them through these stages. 

But the most formidable challenge will be 
repeating the mouse PGC work in humans. 
The group has already started tweaking human 
iPS cells using the same genes that Saitou pinpointed
as being important in mouse germ-cell
development, but both Saitou and Hayashi
know that human signalling networks are different
from those in mice. Moreover, whereas
Saitou had ‘countless’ numbers of live mouse 
embryos to dissect, the team has no access 
to human embryos. Instead, the researchers 
receive 20 monkey embryos per week from 
a nearby primate facility, under a grant of 
¥1.2 billion (US$12 million) over five years. If 
all goes well, Hayashi says, they could repeat 
the mouse work in monkeys within 5-10 years; 
with small tweaks, this method could then be 
used to produce human PGCs shortly after. 
But making PGCs for infertility treatment 
will still be a huge jump, and many scientists 
— Saitou included — are urging caution. Both 
iPS and embryonic stem cells frequently pick 
up chromosomal abnormalities, genetic muta- 
tions and epigenetic irregularities during cul- 
ture. “There could be potentially far-reaching, 
multi-generational consequences if something 
went wrong in a subtle way,’ says Moore. 
Proof that the technique is safe in monkeys 
would help to allay concerns. But how many 
healthy monkeys would need to be born before 
the method could be regarded as safe? And how 
many generations should be observed? 
Eventually, human embryos will need to 
be made and tested, a process that will be 
slowed by restrictions on creating embryos 
for research. New, non-invasive imaging tech- 
niques will enable doctors to sort good from 
bad embryos with a high degree of accuracy.
Embryos that seem to be similar to normal IVF 
embryos could get the go-ahead for implan- 
tation into humans. This might happen with 
private funding or in countries with less- 
restrictive attitudes towards embryo research. 
When the technology is ready, even more 
provocative reproductive feats might be possi- 
ble. For instance, cells from a man’s skin could 
theoretically be used to create eggs that are fer- 
tilized with a partner’s sperm, then nurtured 
in the womb of a surrogate. Some doubt, however,
that such a feat would ever be possible
— the Hinxton Group, an international con- 
sortium of scientists that discusses stem-cell 
ethics and challenges, concluded that it would
be difficult to get eggs from male XY cells and
sperm from female XX cells. “The instructions 
that the female niche is supplying to the male 
cell do not coordinate with each other,” says
Clark, a member of the consortium. 

Saitou used iPS cells from male mice to create 
sperm and from female mice to create eggs, but 
he says that the reverse should be possible. If so, 
eggs and sperm from the same mouse could be 
generated and used for fertilization, producing 
something never seen before: a mouse created 
by self-fertilization. Neither Hayashi nor Saitou 
is ready to try this. “We would only do this [in 
mice] if there were a good scientific reason,” says
Saitou. Right now he does not see one. 

The two scientists already feel some pres- 
sure from patients and Japanese funding agen- 
cies to move forward. The technique could 
be a last hope for women who have had no 
luck with IVF, or for people who had cancer
in childhood and have lost the ability to pro- 
duce sperm or eggs. Hayashi warns those who 
write to him that a viable infertility treatment 
could be 10 or even 50 years in the future. “My 
impression is that it is very far away. I don't 
want to give people unfeasible hope,” he says. 

Patients see the end result — success in 
mice — and often ignore the years of painstaking
work that led to such a technical tour de
force. They do not realize that switching from 
mice to humans means starting again almost 
from scratch, says Hayashi. The human early 
embryo is so different from the mouse that it 
is almost “like starting over on a process that 
took more than ten years”.


David Cyranoski is Nature’s Asia-Pacific
correspondent.

Trial unpredictability yields
predictable therapy gains 


In decades of clinical-trial data, new treatments are better than standard ones just 
over half the time. That’s as it should be, say Benjamin Djulbegovic and colleagues.

fundamentally altered without analogous
evidence that replacement systems will, on 
average, outperform them. 


GENUINE UNCERTAINTY 

Better drugs and therapies have come about 
because people participating in phase III trials 
are willing to be randomly allocated to new 
or existing treatments. Phase III trials are 
typically the final step in evaluating treatment 
efficacy. They are usually preceded by phase I 
trials that assess how a drug is metabolized, 
excreted and tolerated, and phase II trials that 
gather preliminary data on efficacy. Although 
phases I and II can occasionally identify new 
treatments with dramatic effects, thus obviating
the need for further testing, phase III
trials are usually required to judge whether 
new treatments are superior to existing ones. 

On ethical as well as scientific grounds, 
RCTs should be done only when there are 
genuine uncertainties about the relative 
merits of alternative treatments. If there
were a high likelihood (say, more than 80%) 
that one of the treatments in a comparison 
was better than the other, it would be ethi- 
cally unsound to deny some patients access 
to the superior treatment, and even if such 
a trial got past an ethics committee, well- 
informed patients would probably refuse to 
participate. In other words, if the results were 
predictable, the system of RCTs as we know 
it would cease. Progress in therapeutics has
occurred precisely because science and ethics 
require that the results of individual RCTs are 
not predictable. 

Because this ‘uncertainty requirement’ — 
variously referred to as ‘equipoise’, ‘the 
uncertainty principle’ or ‘the indifference 
principle’ — is insufficiently appreciated 
by the public, patients, research funders and 
investigators, we set out to test its long-term 
impact by calculating the average likelihood 
of a proposed new treatment being superior 
to established ones.

We conducted an analysis of 860 published 
and unpublished phase III RCTs performed 
by academics or pharmaceutical companies 
in six consecutive series of trials with a total 
of more than 350,000 patients: four series of 
743 publicly sponsored trials over the past 
50 years, and two series of 117 publicly and
commercially sponsored clinical trials over
the past 30 years (see ‘The best medicine’).
Our results show that the probability of 
finding that a new treatment is better than 
a standard treatment is about 50-60%, 
confirming the theoretical predictions we 
made more than 15 years ago.

We found that in publicly sponsored
RCTs, the likelihood that new treatments
would work better than existing ones ranges
from 57% to 63% for patient survival and
from 55% to 66% for all primary outcomes
(such as survival without recurrence of
disease, response to treatment, symptom
frequency and measures of disability). The
only available comparable rates for industry-sponsored
RCTs show that, overall, new
treatments are superior to existing treatments
for measures of morbidity (nausea, for
example) in 75% of trials, but similar (53%)
for survival. Over time, the pattern in all
trials has converged at around 50% (probably
because earlier studies used inferior comparators)
and applies across various clinical fields
and types of treatment.

THE BEST MEDICINE
In just over 50% of randomized clinical trials, new treatments fare better
than existing ones for both morbidity (A) and mortality (B).


MAXIMUM GAIN 

Philosophers of science have suggested that 
discovery in science happens most rapidly 
when only one or a few hypotheses are tested 
at a time. The RCT system is paradigmatic of
this approach. It has generated incremental 
advances that, together, translate into impor- 
tant improvements in health and lifespan. For 
example, five decades of controlled experi- 
mentation have seen cure rates for childhood 
leukaemia improve from 0% to more than 
80% (ref. 6), yet in testing, only 2-5% of novel 
treatments have provided a breakthrough. 

There is still room for improving existing 
practices for clinical trials. There is substan- 
tial avoidable waste in designing, conducting 
and reporting medical research. For example,
the results of only around 50% of RCTs
are published — negative results and most 
industry trials remain hidden. The rigour of 
randomized trials can also be improved, for 
example by systematically taking into account 
all relevant previous research. 

But our results show that the development 
of new treatments has been possible because 
the trials were done when unpredictabil- 
ity was greatest — in other words, when 
there was the most to gain. The observed
distribution of treatment successes is not an 
accident. There is a predictable relationship
between the uncertainty requirement (the
moral principle) on which trials are based
and the outcomes of clinical trials.

In summary, our retrospective view of 
more than 50 years of randomized trials 
shows that they remain the ‘indispensable 
ordeals’ through which biomedical researchers’
responsibility to patients and the public
is manifested. These trials may need a
tweak and polish, but they’re not broken.


Benjamin Djulbegovic is professor in
the Department of Internal Medicine,
University of South Florida (USF), Tampa,
and at the H. Lee Moffitt Cancer Center and
Research Institute, Tampa, Florida, USA.
Ambuj Kumar is associate professor at
the Department of Internal Medicine, USF,
and at the H. Lee Moffitt Cancer Center
and Research Institute. Paul Glasziou
is professor of evidence-based medicine
at Bond University, Robina, Australia.
Branko Miladinovic is assistant professor
in the Department of Internal Medicine,
USF. Iain Chalmers is coordinator of the
James Lind Initiative, Oxford, UK.

e-mail: bdjulbeg@health.usf.edu 


1. Evans, I., Thornton, H., Chalmers, I. & Glasziou, P. Testing Treatments: Better Research for Better Healthcare 2nd edn (Pinter & Martin, 2011).
2. Micheel, C. M., Nass, S. J. & Omenn, G. S. (eds) Evolution of Translational Omics: Lessons Learned and the Path Forward (National Academies Press, 2012).
3. Glasziou, P., Chalmers, I., Rawlins, M. & McCulloch, P. Br. Med. J. 334, 349–351 (2007).
4. Djulbegovic, B. J. Med. Philos. 32, 79–98 (2007).
5. Chalmers, I. Br. Med. J. 314, 74–75 (1997).
6. Djulbegovic, B. et al. Cochrane Database Syst. Rev. 10, MR000024 (2012).
7. Djulbegovic, B. et al. PLoS ONE 8, e58711 (2013).
8. Platt, J. R. Science 146, 347–353 (1964).
9. Chalmers, I. & Glasziou, P. The Lancet 374, 86–89 (2009).
10. Frederickson, D. S. Control. Clin. Trials 1, 263–267 (1980).


| COMMENT | BOOKS & ARTS 


SCIENCE FICTION 


A post-pandemic 
wilderness 


Paul McEuen relishes the final instalment of Margaret Atwood’s sweeping trilogy 
about a dystopian world devastated by a ‘hot bioform’. 


A decade after Margaret Atwood began
a great dystopian tale, we have at
last reached the end of that road. The
Canadian novelist has taken us from Oryx 
and Crake (2003) and The Year of the Flood 
(2009) to this final instalment, MaddAddam. 
A global pandemic dominates the trilogy. 
In Oryx and Crake, a disillusioned bioengineer
(Crake) unleashes a ‘hot bioform’ that
kills most humans. The Year of the Flood 
revisits the pandemic through the lens of 
a religious cult called God’s Gardeners, 
whose followers try to survive the ravages 
of the pathogen. MaddAddam completes 
the saga with the story of two members of 
the cult, Toby and Zeb, as they live through 
the aftermath of the plague. In the dystopian 
tradition, the trilogy is a window on our pos- 
sible near future — in this case, one driven 
to disaster by human ingenuity gone wrong. 
As MaddAddam opens, with almost all of 
humanity having perished in Crake’s ‘Waterless
Flood’, it turns out that the bioengineer


had good reason to reboot the human race. 
Atwood paints a picture of a pre-flood night- 
mare, class-divided, corporate and hegem- 
onic. This was a world of Hunger Games-like 
death sports, rampant sexual enslavement 
and increasingly macabre genetically engi- 
neered hybrids. It begged to be wiped out. 
MaddAddam
MARGARET ATWOOD
Bloomsbury Publishing: 2013.

The surviving humans must cope with a
number of relics of pre-flood genetic tinkering.
These include Pigoons — large, ferocious
pigs with near-human intelligence,
originally created for organ transplants —
and domesticated goats with human hair
known as Mo’Hairs. Also surviving is a
small group of humanoids called Crakers,
so-named for their creator and genetically
modified to be polyamorous innocents with
a predilection for eating kudzu (an invasive 
plant). These are the meek whom Crake 
would have had inherit the Earth, but they 
face many dangers. The remaining humans, 
especially Toby and Zeb, protect them from 
the Pigoons and a pair of murderous death- 
game survivors who have already raped and 
killed some of their clan. 

As time passes, the Crakers begin to show 
signs of culture. They sing songs, beatify 
their now-dead creator, and hunger for 
more myths and stories about their origins. 
Toby, the book’s main protagonist, provides 
these as best she can, and we watch with 
hope and dread as she spins child-like tales 
for the Crakers out of the unseemly facts of 
the Flood. 

Many of these stories are told in flashback, 
particularly the full story of Adam and Zeb, 
who are in some ways the moral poles of 
MaddAddam. It is a biblical tale of grace
and punishment, false idols and vengeance; 


ILLUSTRATION BY JOHN RIORDAN 


but Atwood keeps the morality multifaceted, 
making a case for both pacifism and, when 
absolutely necessary, murder. 

Technology is the apple in the garden. 
In the pre-flood world, it evolved faster 
than it could be assimilated. Technology 
overwhelmed its creators, preying on their 
basest instincts and enslaving and degrading 
them. Plucked from the tree, it spread and 
destroyed. 

It is a pattern that threatens to repeat itself 
with the Crakers. Language, Atwood maintains,
was humankind’s first technology, and
one of the most oddly chilling scenes occurs 
when the Crakers take the first bite of the 
apple. Toby is teaching one of the Crakers 
— a young boy named Blackbeard — about
writing. The innocent Blackbeard refuses 
to accept the idea that pieces of the sensual 
world around him can be captured in lines 
on paper. Toby persists, showing the boy his 
name on a page. “This is how your name 
begins. B. Like bees. It’s the same sound.” But 
Blackbeard replies “That is not me,” adding 
“It is not bees either.”

Blackbeard learns in the end. He has 
tasted the fruit of the tree. But language is 
shown to be a saviour too. The secret to a 
new beginning for Toby, Zeb and the Crakers 
lies in forging deep links between the experi- 
ences of the humans and the Crakers, as well 
as the Mo’Hairs, bees and even Pigoons. This
is how they start the world anew: as a process 
of weaving different languages and under- 
standings of the world into a unified tapes- 
try. Atwood shows us that what is missing 
in the fast-evolving technological world is a 
constant awareness of the link between the 
iPad and the exploited worker in China, or 
the hamburger on the plate and the factory- 
farmed cow. 

Will Atwood’s imagined future be our own? 
Some elements of it will undoubtedly happen. 
Bioengineered meats are a staple in Atwood’s 
pre-flood world, and earlier this month a 
bovine stem-cell hamburger created by Mark 
Post, a tissue engineer at Maastricht Univer- 
sity in the Netherlands, was cooked and eaten. 
Will our technologies swallow us? The book's 
palindromic title suggests as much: disastrous 
ends yoked to new beginnings, with one flow- 
ing into the other in a never-ending cycle. But 
MaddAddam also tells us, even in the face of 
a disaster, to persevere. Atwood’s book is a 
warning but also, in its final accounting, a 
hopeful meditation on the cycle of life, death
and the possibility of life anew.


Paul L. McEuen is the John A. Newman 
Professor of Physical Science at Cornell 
University, New York, and director of the 
Kavli Institute at Cornell for Nanoscale 
Science. His scientific thriller Spiral was 
named Best Debut Novel of 2012 by the 
International Thriller Writers association. 
e-mail: pmceuen@gmail.com 


Books in brief 


Scarcity: Why Having Too Little Means So Much
Sendhil Mullainathan and Eldar Shafir ALLEN LANE (2013)
Two scientists reveal that scarcity — “having less than you feel you
need” — is a central factor in a raft of societal challenges. Economist
Sendhil Mullainathan and psychologist Eldar Shafir posit that when
we lack money or attention, for example, we obsess about it, leaving
us little mental capacity to plan, meet other needs or practise self-control.
We can become entrapped and eventually derailed by a
vicious cycle. By reframing the dynamic as a mindset rather than a
human failing, Mullainathan and Shafir train a new lens on chronic
obesity, endemic poverty and desperate loneliness.


Deep Sea and Foreign Going: Inside Shipping, the Invisible 
Industry That Brings You 90% of Everything 

Rose George PORTOBELLO BOOKS (2013)

Some 746 million bananas (“one for every European”) can fit into 
the largest container ship, notes journalist Rose George. About 
100,000 cargo carriers ply the world’s oceans, transporting 90% of 
our stuff. Yet these metallic Moby Dicks criss-crossing the lawless 
reaches of international waters can be hotbeds of crime, magnets 
for piracy and nemeses for sea life. Travelling with George on the 
Maersk Kendal from Felixstowe in the United Kingdom to Singapore, 
we are regaled — and horrified — by her salvos of facts. Riveting. 


Five Days at Memorial: Life and Death in a Storm-Ravaged Hospital
Sheri Fink CROWN (2013)
Medical ethics and disaster management take centre stage in this
harrowing chronicle of a hospital besieged by Hurricane Katrina.
Pulitzer-prizewinning journalist Sheri Fink tells how for five days in
August 2005, a botched evacuation left hundreds trapped in the
hot, increasingly filthy Memorial Medical Center in New Orleans. A
handful of doctors and nurses were then alleged to have injected
some of the severely ill with lethal drug doses. Fink reports on the
ensuing nightmare with clarity and not a little compassion.


The Secret World of Sleep: The Surprising Science of the Mind at 
Rest 
Penelope A. Lewis PALGRAVE MACMILLAN (2013) 
The sleeping brain is not at rest: so reveals neuroscientist Penelope 
Lewis in this nippy primer on the biology and behaviour associated 
with snoozing. There is much to fascinate, such as the beneficial 
synaptic clear-outs enacted by slow-wave sleep, and the ascending 
reticular activating system — brainstem ganglia that send 
neurotransmitters to the rest of the brain to signal that it is time 
to wake up. From the latest on narcolepsy to the sleep-inhibiting 
qualities of smoked meat, this is wide-awake science. 




Old Man River: The Mississippi River in North American History 
Paul Schneider HENRY HOLT (2013) 

It has been a bath for mammoths, a road for steamboats and 
a headache for engineers. The mighty Mississippi is a river that 
defines a nation, its tributaries branching out across the United 
States from Montana to Pennsylvania. In his natural and cultural 
history, Paul Schneider takes us from its origins 200 million years 
ago to its dammed and polluted present. His vast cast of heroes and 
eccentrics includes nineteenth-century showman Albert Koch, who 
haphazardly assembled fossils dug from Mississippi mud. Barbara Kiser 
Correspondence 


Antibiotics: support 
US policy change 


As a microbiologist and member 
of the US Congress, I applaud 
your call to action on the overuse 
of antibiotics in agriculture 
(Nature 499, 379, 394-396, 
398-400; 2013). I have been 
fighting since 1999 to pass the 
Preservation of Antibiotics 
for Medical Treatment Act 
(PAMTA), which would ban the 
use of eight classes of medically 
important antibiotics in 
agriculture, with exceptions for 
treating sick animals. 

In the United States, antibiotics 
are often distributed at sub- 
therapeutic doses to healthy 
farm animals to compensate for 
crowded and unsanitary living 
conditions or to promote growth. 

In June, science ministers 
from the G8 nations discussed 
antibiotic resistance and 
committed to clamping down on 
overuse of antibiotics in health 
care, farming and fisheries. It is 
only through such coordinated 
international action that we can 
begin to hold back the tide of 
antibiotic-resistant bacteria. 

Policy-makers need help, as 
you point out. I urge US readers 
to take a stand on this issue and 
ask their representatives and 
senators to co-sponsor PAMTA 
in the House of Representatives 
or the Preventing Antibiotic 
Resistance Act in the Senate (see 
www.louise.house.gov). 

Louise M. Slaughter Washington 
DC, USA. 
eric.walker@mail.house.gov 


Antibiotics: collect 
more US data 


The paucity of data on antibiotic 
use in livestock and poultry 
in the United States makes it 
hard for scientists to assess the 
relationship with antibiotic 
resistance (Nature 499, 398-400; 
2013). More comprehensive 
data need to be collected and 
made freely available to bring 
the United States in line with 
countries such as Denmark, 
where antibiotic use can be 
traced to individual animals. 

The only US antibiotic data 
available are the sales figures that 
drug companies report to the 
Food and Drug Administration 
(FDA), which are published as 
total sales for each antibiotic 
class. Such broad aggregated 
data are of limited value, beyond 
confirming the extensive use of 
antibiotics in animals reared 
for food. 

In February 2011, the Center 
for a Livable Future at the Johns 
Hopkins University in Baltimore, 
Maryland, and the Government 
Accountability Project (GAP) 
in Washington DC attempted to 
obtain more detailed antibiotics 
data from the FDA under the 
Freedom of Information Act. The 
FDA denied the request, claiming 
that these commercial data are 
confidential. In December 2012, 
GAP sued the FDA for access to 
the data (the case is ongoing). 

Given that the misuse of 
antibiotics erodes their efficacy, 
there is an urgent need for 
greater transparency over their 
use. We contend that the FDA, 
as a public-health agency, is 
responsible to the public, not 
to the industry it regulates. It is 
imperative that more antibiotic 
data be released so that evidence- 
based public-health policies 
can be developed to combat 
antibiotic resistance. 

Robert S. Lawrence, Keeve E. 
Nachman, Tyler J. Smith Johns 
Hopkins Center for a Livable 
Future, Baltimore, Maryland, 
USA. 

tylsmith@jhsph.edu 


Fukushima: unpaid 
soil-research effort 


We and other particle 
physicists working voluntarily 
in Fukushima, Japan, where 
people were evacuated in 2011 
because of the nuclear accident, 
are neither “opportunistic” nor 
“adventurous”, as you quote (see 
Nature 499, 265-266; 2013). We 
seek only to use our expertise 
to find a way to reduce the 
radioactive contamination of the 
area’s soil. 

We give up our weekends to 
work independently on analysing 
soil samples — a gruelling task 
in the winter months. We give 
lectures on radiation to those 
forced to leave their homes, so 
that they can better understand 
their plight and our efforts to 
remedy it. Our goal is to enable 
local farmers’ livelihoods 
eventually to be restored. 

Tokio Kenneth Ohska, Hiroshi 
Iwase High Energy Accelerator 
Research Organization, Tsukuba, 
Japan. 

hiroshi.iwase@kek.jp 


Fukushima: ‘ecolab’ 
branding insensitive 


As the organizers of a 
symposium on the genetic 
effects of radiation following 
the Fukushima disaster — held 
at this year’s annual meeting 
of the Society for Molecular 
Biology and Evolution — we 
object strongly to the headline 
of your report ‘Fukushima offers 
real-time ecolab’ (Nature 499, 
265-266; 2013). 

In our view and those of many 
others from Japan and elsewhere 
who have communicated their 
objections to us, it conveys a lack 
of empathy among researchers 
for the suffering of the people 
and animals affected by the 
Fukushima power-plant disaster. 

Scientists working on the 
consequences of the catastrophic 
events of March 2011, including 
the symposium panellists and 
ourselves, always take into 
primary consideration the 
pain of people in Fukushima. 
The researchers would never 
insult them by branding them 
or their natural environment as 
experimental material. 

Your headline does not reflect 
the aims of our symposium or of 
the panellists’ research. 
Tomoko Y. Steen Georgetown 
University, Washington DC, USA. 
tys8@georgetown.edu 
Marta L. Wayne University of 
Florida, Gainesville, USA. 


Curb indigenous 
fears of REDD+ 


One of Panama’s leading 
traditional indigenous 
authorities, the Guna General 
Congress, in June banned 
a project aimed at reducing 
emissions from deforestation and 
forest degradation (REDD+). 
The Congress, which controls 
about 7% of Panama’s primary 
forests, went further, forbidding 
organizations in the Guna Yala 
territory from engaging in 
REDD+ activities, and walked 
out of REDD+ discussions. We 
believe that this crisis stems from 
a failure to build REDD+ capacity 
for indigenous people at all levels: 
it is time to pay more than lip 
service to their full and effective 
participation in REDD+. 
REDD+ started well in 
Panama. The country put the 
rights of indigenous peoples on 
the agenda of the United Nations 
Framework Convention on 
Climate Change, and REDD+ 
project promoters complied 
with consent procedures of 
the Guna General Congress. 
Panama's National Coordinating 
Body of Indigenous Peoples 
(COONAPIP) drafted a plan in 
2011 for comprehensive REDD+ 
capacity-building efforts in each 
indigenous territory. This would 
have stimulated debates about 
fears that REDD+ might threaten 
traditional land uses and rights, 
as well as possible ways forward. 
Knowledge transfer is the best 
antidote for the fear of REDD+. 
The plan failed to receive 
UN funding. COONAPIP 
withdrew from the UN-REDD 
programme in February and 
called on indigenous peoples 
globally to proceed cautiously on 
REDD-related matters. If this fear 
of participation spreads beyond 
Guna Yala, the programme could 
be jeopardized in other Latin 
American countries. 
Catherine Potvin, Javier 
Mateo-Vega McGill University, 
Montreal, Canada; and 
Smithsonian Tropical Research 
Institute, Panama. 
catherine.potvin@mcgill.ca 


OBITUARY 


Michael John Morwood 


(1950-2013) 


Rock-art archaeologist and driving force behind the ‘Hobbit’ discovery. 


Michael John Morwood — Mike 
to his mates — was at heart what 
Australians would call a larrikin. 
Shaded by his battered bushman’s hat on his 
frequent trips in the field, he wasted no time 
with small talk, and his quizzical stare gave 
him a slightly zealous demeanour. But his 
vision, intuition and leadership resulted in 
the 2003 discovery of Homo floresiensis, a 
species of archaic human identified from 
fossils found in eastern Indonesia. Given 
the type specimen’s short stature, Morwood 
dubbed it “Hobbit” after the fictional inhabitants 
of Middle-earth in J. R. R. Tolkien's 
Lord of the Rings. 

Born in 1950 in Auckland, New Zealand, 
Morwood died of cancer in Darwin, Aus- 
tralia, on 23 July, en route to Indonesia. He 
was fascinated by Aboriginal rock art, the 
origins of the first Australians and their 
ancient connections with southeast Asia. 
As a state archaeologist in Queensland 
and during his PhD at the Australian 
National University in Canberra, Morwood 
pioneered studies that integrated rock art 
with artefacts recovered from excavations 
in Queensland. After joining the Univer- 
sity of New England in Armidale, New 
South Wales, in 1981, he concentrated on 
sites in northern Queensland, culminat- 
ing in the monograph Quinkan Prehistory 
(Tempus, 1995). This comprehensive 
work, edited with his long-time colleague 
Douglas Hobbs, gives a multidisciplinary 
perspective on 50,000 years of human activ- 
ity in an environmental context, and set the 
tone for Morwood’s subsequent projects in 
Western Australia and Indonesia. 

Morwood served as president of the 
Australian Rock Art Research Association 
from 1992 to 2000, and in 2002 published 
Visions from the Past (Allen & Unwin). In 
this acclaimed book, he offers a continent- 
wide analysis of the rock art and archaeology 
of ancient Australia, drawing on his first- 
hand experiences spanning almost three 
decades. In 2007, he became a key figure in 
the nascent Centre for Archaeological Science 
at the University of Wollongong in New South 
Wales, where he planned and led further 
expeditions to Indonesia, and mentored the 
next generation of archaeologists. 

In the mid-1990s, evidence of ancient 
contact between people of northern Aus- 
tralia and Indonesia led Morwood to launch 
a series of projects on Flores, an Indonesian 
island separated from mainland Asia by 
several sea crossings. Morwood ventured 
first to central Flores, where in the 1960s 
the Dutch priest and amateur archaeologist 
Theodorus Verhoeven had contentiously 
reported finding 750,000-year-old stone 
tools. Morwood collaborated with research- 
ers at Pusat Survei Geologi, the geological 
survey institute in Bandung, Indonesia, 
and with Australian geochronologists to 
prove that Verhoeven’s conclusions were 
essentially correct. His team subsequently 
extended the antiquity of tool-making on 
Flores to 1 million years (possibly by ances- 
tors of H. floresiensis) — the earliest evidence 
for humans east of Wallace’s line, which 
separates the fauna of Asia and Australia. 
Next, Morwood revisited another of 
Verhoeven’s sites, a limestone cave in western 
Flores, looking for traces of ancestors of the 
first Australians. His direct and dogged style 
of negotiation with archaeologists at Pusat 
Arkeologi Nasional, the national archeo- 
logical centre in Jakarta, succeeded where 
previous Australian archaeologists had failed, 
and he jointly led an Australian–Indonesian 
team to the cave in 2001. Two years later, after 
Morwood had returned to Java, leaving one of 
us (T.S.) to complete excavation of a 6-metre- 
deep hole at the site, Hobbit was discovered 
there unexpectedly. The news was relayed 
immediately to Morwood, who arranged safe 
transport of the fragile fossils to Jakarta for 
detailed study. A political and scientific saga 
unfolded, colourfully chronicled in Morwood 
and Penny van Oosterzee’s popular-science 
book, The Discovery of the Hobbit (Random 
House, 2007). As other researchers jostled for 
the skeleton, bones were damaged, altered 
and sampled for DNA, resulting in acrimo- 
nious accusations and strenuous denials of 
incompetence and ethical misconduct. 


Published in Nature in 2004, the discovery 
attracted intense scientific scrutiny and 
media coverage, propelling Morwood into 
the spotlight — sometimes reluctantly. That 
a 1-metre-tall human species with archaic 
features had survived until after Homo 
sapiens reached southeast Asia and Australia 
was a finding welcomed enthusiastically by 
some, but viewed sceptically by others. In 
response to concerns that H. floresiensis was 
not a new species, but a diseased member of 
H. sapiens, Morwood invited other human- 
evolution researchers to study and sample the 
fossils. This spirit of open enquiry epitomized 
his integrity and insistence on transparency. 
H. floresiensis is now generally accepted as 
a valid species, but its evolutionary lineage, 
geographical distribution and period of exist- 
ence remain open questions that Morwood 
spent his final decade striving to answer. 

Morwood prized his long-term collabo- 
rations in Indonesia and was always deeply 
respectful of indigenous communities and 
egalitarian in his dealings, treating senior 
colleagues and students alike. He took 
great care to educate, nurture and enthuse 
archaeology students and young researchers 
in Australia and Indonesia, and he inspired 
strong loyalties in his collaborators. 

Field trips with Mike were memorable 
because of his insatiable curiosity and thirst 
for discovery (especially of another human 
species unknown to science), his whimsi- 
cal sense of humour (Mike only half-joked 
that Hobbit should have been named Homo 
hobbitus) and his delight in adding a sword 
to his swashbuckling collection. He could 
also be exasperating: Mike paid scant atten- 
tion to anything — or anyone — outside 
his immediate field of vision, and he had 
no patience for administrative paperwork 
that might impede his progress. But it was 
this single-minded drive and tenacity that 
enabled him to accomplish so much. We 
and many archaeologists in Australia and 
Indonesia will sorely miss Mike's camaraderie, 
restless energy and zest for adventure. 


Richard G. Roberts and Thomas Sutikna 
are at the Centre for Archaeological Science, 
University of Wollongong, Australia. T.S. is 
also at Pusat Arkeologi Nasional, Jakarta, 
Indonesia. They collaborated with Mike 
from the 1990s on projects in Australia and 
Indonesia, including the Hobbit discovery. 
e-mails: rgrob@uow.edu.au; 
thomasutikna@yahoo.com 


NEWS & VIEWS 


PLANT BIOLOGY 


Electric defence 


Herbivory and mechanical wounding in plants have been shown to elicit electrical signals — mediated by two glutamate- 
receptor-like proteins — that induce defence responses at local and distant sites. SEE LETTER P.422 


ALEXANDER CHRISTMANN & ERWIN GRILL 


The mammalian nervous system can 
relay electrical signals at speeds 
approaching 100 metres per second. 
Plants live at a slower pace. Although they 
lack a nervous system, some plants, such as 
the mimosa (Mimosa pudica) and the Venus 
flytrap (Dionaea muscipula), use electrical sig- 
nals to trigger rapid leaf movements. Signal 
propagation in these plants occurs at a rate of 
3 centimetres per second — comparable to 
that observed in the nervous system of mussels. 
On page 422 of this issue, Mousavi et al. 
address the fascinating yet elusive issue of 
how plants generate and propagate electrical 
signals. The authors identify two glutamate- 
receptor-like proteins as crucial components 
in the induction of an electrical wave that is 
initiated by leaf wounding and that spreads 
to neighbouring organs, prompting them 
to mount defence responses to a potential 
herbivore attack. 

As sessile organisms, plants have evolved 
diverse strategies to combat herbivores. 
These include mechanical defences, such as 
the thorns found on rose bushes, and chemi- 
cal deterrents, such as the insect-neurotoxic 
pyrethrins of the genus Chrysanthemum. 
However, some plants do not invest in con- 
tinuous defensive structures or metabolites, 
relying instead on the initiation of defence 
responses on demand. This strategy requires 
an appropriate surveillance system and rapid 
communication between plant organs. A 
key player in orchestrating these reactions is 
the lipid-derived plant hormone jasmonate, 
which rapidly accumulates in organs remote 
from the site of herbivore feeding. 

Figure 1 | Protective responses induced by electrical signalling. On herbivore attack, levels of the plant 
hormone jasmonate increase, triggering defence responses. Mousavi et al. show that leaf injury, caused 
by herbivory or mechanical wounding, induces the transmission of electrical signals that are generated 
by the activity of glutamate-receptor-like (GLR) ion channels. These signals induce the formation of 
jasmonate at local and distant sites in the plant. 

Mousavi et al. used thale cress (Arabidopsis 
thaliana) plants and Egyptian cotton 
leafworm (Spodoptera littoralis) larvae as a 
model of plant-herbivore interactions. The 
researchers placed the larvae on individual 
leaves and recorded changes in electrical 
potentials using electrodes grounded in the 
soil and on the surface of different leaves. The 
leaf-surface potential did not change when a 
larva walked on a leaf, but as soon as it started 
to feed, electrical signals were evoked near 
the site of attack and subsequently spread 
to neighbouring leaves at a maximum speed 
of 9 centimetres per minute. The relay of 
the electrical signal was most efficient for 
leaves directly above or below the wounded 
leaf. These leaves are well connected by the 
plant vasculature, which conducts water and 
organic compounds, and is a good candi- 
date for the transmission of signals over long 
distances. 
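
To put the three signal speeds quoted in this article on a common footing, the short Python sketch below converts each to centimetres per second. It uses only the figures given in the text; the dictionary keys and the helper name are illustrative choices.

```python
# Signal speeds quoted in the text, converted to centimetres per second.
speeds_cm_per_s = {
    "mammalian nerve (upper figure)": 100 * 100,   # 100 metres per second
    "mimosa / Venus flytrap": 3.0,                 # 3 centimetres per second
    "Arabidopsis wound signal (maximum)": 9 / 60,  # 9 centimetres per minute
}


def slowdown(reference, other):
    """Return how many times slower `other` is than `reference`."""
    return speeds_cm_per_s[reference] / speeds_cm_per_s[other]


if __name__ == "__main__":
    for name, speed in speeds_cm_per_s.items():
        print(f"{name}: {speed:g} cm/s")
    ratio = slowdown("mimosa / Venus flytrap", "Arabidopsis wound signal (maximum)")
    print(f"wound signal is about {ratio:.0f} times slower than the mimosa signal")
```

On these numbers the wound-induced signal travels about 20 times more slowly than the leaf-movement signals of mimosa and the Venus flytrap, and tens of thousands of times more slowly than the fastest mammalian nerve signals.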

At all sites that received the electrical sig- 
nals, jasmonate-mediated gene expression was 
turned on and initiated defence-responsive 
gene expression. In a mutant A. thaliana plant 
lacking the receptor for jasmonate, an elec- 
trical signal was propagated but no defence 
response was elicited. Defence responses also 
failed to occur at remote sites when the trans- 
mission of the electrical signal was prevented 
by ablation of the damaged leaf before the 
signal had passed the leaf stalk. These fasci- 
nating observations clearly demonstrate that 
electrical signal generation and propagation 
have a crucial role in the initiation of defence 
responses at remote sites upon herbivore 
attack. 

The salivary secretions of herbivores con- 
tain elicitor molecules that are recognized by 
the host plant and that induce jasmonate- 
mediated defence responses. However, 
Mousavi and colleagues found that extensive 
mechanical wounding (in the absence of her- 
bivory) also initiated electrical signal transmis- 
sion and jasmonate biosynthesis. In addition, 
a herbivore-response gene-expression pattern 
could be artificially induced by applying elec- 
tric pulses that mimicked the plant's electri- 
cal signals. Thus, it remains unclear how the 
electrical signals are interpreted to stimulate 
jasmonate biosynthesis. 

The authors next investigated which cell- 
ular components are involved in generating 
the electrical signals, by screening A. thaliana 
plants defective in candidate ion pumps and 
channels. They found that loss of function of 
certain members of the glutamate-receptor- 
like (GLR) family of ion-channel proteins — 
some of which form calcium-ion-permeable 
channels that can be activated by agonists such 
as glutamate and serine — affected wound-induced 
signal generation. Indeed, 
combined disruption of the genes encoding 
two of these channels, glr3.3 and glr3.6, 
resulted in the electrical wave no longer 
propagating after wounding. 

“Electrical signals evoked near the site of attack spread to neighbouring leaves at a maximum speed of 9 centimetres per minute.” 

Thus, it seems that 
herbivory and mechanical wounding trigger 
the local generation of an electrical signal 
through the activity of GLRs; this signal 
then spreads to neighbouring organs where 
the biosynthesis of jasmonate is induced, in 
turn triggering jasmonate-dependent defence 
responses (Fig. 1). Several questions emerging 
from this study will foster future research 
efforts. For example, how do feeding and 
mechanical wounding activate the GLRs? 
Might calcium ions be involved in the generation 
and maintenance of the electrical wave? 
It will also be intriguing to elucidate whether 
GLRs relay the faster electrical signalling that 
triggers movement in mimosa and the Venus 
flytrap. 

Plant wounding is also known to evoke an 
extracellular wave of reactive oxygen species 
(ROS), which propagates at a speed comparable 
to that recorded by Mousavi et al. for 
the electric signals. But the authors found 
that inhibiting wound-induced ROS generation 
did not substantially disrupt electric 
signalling, so it remains to be determined 
whether there is an interaction between 
wound-induced ROS signalling and electric 
signalling. 

It is interesting to note that plant GLRs are 
structurally related to vertebrate ionotropic 
glutamate receptors, which are important 
for rapid excitatory synaptic transmission in 
the nervous system. Insect feeding on leaves 
has also been shown to generate an electric 
wave by a continuous relay of cell-membrane 
depolarizations that is reminiscent of 
excitatory signal propagation in animals. 
Together, these findings imply that ionotropic 
glutamate-receptor-type proteins must have 
existed before animals and plants diverged. 
These ancestral proteins might already have 
functioned in the generation of long-distance 
warning signals to elicit the timely initiation 
of protective responses. 


ASTROPHYSICS 


Twinkling stars 


A correlation between stellar brightness variations and the gravitational 
acceleration at a star’s surface has been observed that allows this acceleration to 
be measured with a precision of better than 25%. SEE LETTER P.427 


JØRGEN CHRISTENSEN-DALSGAARD 

“Twinkle, twinkle little star, how 
I wonder what you are.” Given 
the wording of this old nursery 
rhyme, it is highly satisfying that Bastien 
et al. (page 427 of this issue) find that a 
star’s twinkle may hold the key to determining 
its properties. The authors used 
data from NASA’s Kepler space mission 
to show that accurate measurements of 
variations in a star’s light reveal information 
about the acceleration of gravity at the 
star’s surface. This result is significant for 
the characterization of stars, and in particular 
for the determination of radii of 
stars hosting planetary systems. 

Essentially all knowledge about distant 
stars derives from observation of the light 
emitted by their outer layers. Therefore, 
the properties of these layers are central 
to the study of stars. These properties have 
conventionally been obtained from analysis 
of stellar spectra, but the gravitational 
acceleration (g) has proved notoriously 
difficult to nail down, and the resulting 
uncertainty about this quantity has substantial 
effects on the measurement of other properties, 
such as temperature and chemical composition. 

Figure 1 | Stellar variability. Bastien et al. describe the 
brightness variations of Sun-like stars in terms of variations 
on timescales of days (total range; here about 1 part per 
thousand) and of variations on timescales shorter than 
8 hours (flicker; here roughly 0.03 p.p.t.). The red curve 
shows the result of smoothing the blue curve with an 8-hour 
running mean. 

Analyses of variations in stellar brightness 
caused by stellar oscillations (asteroseismology), 
particularly those based on the spectacular 
data from the Kepler mission, provide 
precise determinations of g but require exten- 
sive observations and complex analysis, which 
are available for only a limited number of 
stars. However, stellar oscillations are 
not the only factor that contributes to 
variations in brightness. Bastien and col- 
leagues show that g is also reflected in 
these variations. 

One of the properties of a star’s bright- 
ness variations measured by Bastien et al. 
is the total range of the variations. This 
includes variations on timescales of days 
that may have a number of causes, such as 
the rotation of large starspots across the 
disc of the star. In addition to this total 
range, the authors characterize the vari- 
ations in terms of what they call ‘flicker’ 
— variations that occur on timescales 
shorter than eight hours (Fig. 1). In the 
Kepler data, they identify a substantial 
fraction of Sun-like stars that have a low 
range, defining what they dub ‘flicker 
floor’ (see Fig. 3 in the paper). Bastien 
et al. find that, for a subset of these flicker- 
floor stars whose precise values of g are 
known from asteroseismology, there is 
a close correlation between flicker and g, 
with the amplitude of the flicker increas- 
ing with decreasing g. For other stars on 
the flicker floor, this correlation provides a 
means of determining g with a precision of 
better than 25% — between two and three 
times better than the precision obtained from 
conventional observations. 
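
A minimal Python sketch of how the two quantities might be separated in practice, taking the 8-hour running mean mentioned in the Figure 1 caption as the dividing line between slow and fast variations. The 30-minute sampling, the percentile-based range, the root-mean-square definition of flicker and the synthetic light curve are all assumptions made for illustration; this is not the authors' pipeline.

```python
import numpy as np


def running_mean(flux, window):
    """Boxcar running mean; edges are padded so the output matches the input length."""
    left = window // 2
    right = window - 1 - left
    padded = np.pad(flux, (left, right), mode="edge")
    return np.convolve(padded, np.ones(window) / window, mode="valid")


def range_and_flicker(flux, cadence_hours=0.5, window_hours=8.0):
    """Return (total range, flicker) in the units of `flux`."""
    window = int(round(window_hours / cadence_hours))
    smooth = running_mean(flux, window)
    # Assumed definition: flicker is the r.m.s. of what the running mean removes.
    flicker = float(np.sqrt(np.mean((flux - smooth) ** 2)))
    # Assumed definition: range is the 5th-to-95th percentile spread.
    total_range = float(np.percentile(flux, 95) - np.percentile(flux, 5))
    return total_range, flicker


def synthetic_light_curve(days=30, cadence_hours=0.5, seed=0):
    """Toy light curve in parts per thousand: slow spot-like wave plus fast jitter."""
    rng = np.random.default_rng(seed)
    t = np.arange(0, days * 24, cadence_hours)       # time in hours
    slow = 0.5 * np.sin(2 * np.pi * t / (10 * 24))   # hypothetical 10-day modulation
    fast = rng.normal(0.0, 0.03, t.size)             # hypothetical short-timescale jitter
    return slow + fast


if __name__ == "__main__":
    total_range, flicker = range_and_flicker(synthetic_light_curve())
    print(f"range = {total_range:.2f} p.p.t., flicker = {flicker:.3f} p.p.t.")
```

In this toy case the slow modulation dominates the range while the fast jitter sets the flicker, which mirrors the separation of timescales described above.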

As Bastien et al. note, the general depend- 
ence of brightness variations on stellar prop- 
erties can be described in terms of stellar 
evolution. Young stars tend to have stronger 
magnetic activity, with many starspots, and 
hence display a large total range of variations. 
With increasing age, this activity diminishes 
and the stars settle on the flicker floor. As the 
stars grow older their radii increase, leading to 
a lower g and a higher flicker. 
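
The link between a growing radius and a falling g follows from Newtonian gravity, g = GM/R². The Python sketch below illustrates this with solar values. The constants are standard nominal values, the function names are illustrative, and inverting the relation for a radius requires an assumed stellar mass, so this is a simplification and not a substitute for detailed stellar modelling.

```python
import math

GM_SUN = 1.32712440018e20   # m^3 s^-2, solar gravitational parameter
R_SUN = 6.957e8             # m, nominal solar radius


def surface_gravity(mass_solar, radius_solar):
    """Surface gravity in m/s^2 from g = G M / R^2 (inputs in solar units)."""
    return GM_SUN * mass_solar / (R_SUN * radius_solar) ** 2


def radius_from_gravity(mass_solar, g):
    """Radius in solar units, inverting g = G M / R^2 for an assumed mass."""
    return math.sqrt(GM_SUN * mass_solar / g) / R_SUN


if __name__ == "__main__":
    g_sun = surface_gravity(1.0, 1.0)
    g_evolved = surface_gravity(1.0, 2.0)   # same mass, doubled radius
    print(f"Sun: {g_sun:.0f} m/s^2; doubled radius: {g_evolved:.0f} m/s^2")
```

Because the radius scales as the inverse square root of g at fixed mass, a fractional uncertainty in g translates into roughly half that fractional uncertainty in the radius.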

Bastien and colleagues demonstrate how 
flicker can be used to measure g, but do not 
provide a detailed analysis of the physical 
nature of the flicker. The stars for which the 
investigation was carried out have outer con- 
vection zones, in which energy is transported 
to the surface through the motion of gas. In 
the Sun, this transport is visible in granula- 
tion — a time-varying pattern of small-scale 
brighter and dimmer regions on the solar sur- 
face that reflects hot, rising and cooler, sinking 
gas pockets. Granulation also leads to minute 
variations in the total solar brightness. 

The authors’ study indicates that stellar 
granulation is a contributor to flicker. 
Indeed, the spatial scale and other properties 
of granulation depend on g (ref. 3), with lower 
g resulting in a larger scale and thus probably 
causing larger brightness variations on 
timescales relevant to flicker, in agreement 
with the correlation that the authors found. 
Further support for the relationship between 
granulation and flicker comes from other 
Kepler observations and modelling of red-giant 
stars. Brightness variations caused by 
granulation are expected in all the stars considered 
by the authors, hence defining a lower 
limit to the variations — the flicker floor. A 
better physical understanding of the origin of 
flicker might allow the observed brightness 
variations to be used to probe the dynamics 
of the outermost stellar layers. The resulting 
improved stellar modelling could, in turn, 
improve the accuracy with which g can be 
determined. 

The very small amplitude of flicker 
makes it essentially unobservable using 
ground-based telescopes, owing to 
the effect of Earth’s atmosphere. However, 
the ability to measure flicker with Kepler 
observations will be valuable in the continuing 
analysis of Kepler data on exoplanets, which 
are detected through the slight dimming 
of a star’s light as a planet transits, or passes 
in front of it. An accurate determination 
of g from flicker greatly aids the analysis of 
spectroscopic observations used to infer the 
chemical composition of planet-hosting stars, 
and so advances our understanding of planet 
formation. Furthermore, planetary transit 
observations provide a measurement of only 
planetary radius relative to stellar radius, and 
uncertain information about stellar radii hampers 
the characterization of the planets. With 
knowledge of g from flicker, as well as of the 
surface temperature and composition of the 
star, fits of stellar models to these quantities 
can be used to obtain a more precise value of 
the stellar radius, and hence of the planetary 
radius. 

“Studying the twinkling of stars does indeed help us to understand what they are.” 

Beyond Kepler, the authors’ technique will 
be valuable for NASA’s planned Transiting 
Exoplanet Survey Satellite (TESS), which is 
slated for launch in 2017. TESS will carry out 
an all-sky survey for extrasolar planetary systems 
by monitoring at least half a million stars, 
and will require efficient methods to characterize 
the target stars. The same applies to the 
European Space Agency’s Planetary Transits 
and Oscillations of Stars (PLATO) exoplanet 
mission, should it be selected for launch in 
2022-24. 

Therefore, Bastien and colleagues’ analysis 
holds great promise for measuring stellar properties 
and understanding the complex dynamics 
of the outermost layers of stars. Studying 
the twinkling of stars does indeed help us to 
understand what they are. 


BIOTECHNOLOGY 


Programming genomes with light 


The combination of two techniques — optogenetics and genome editing using 
engineered nucleases — now provides a general means for the light-controlled 
regulation of any gene of interest. SEE LETTER P.472 


ANDREAS MÖGLICH & PETER HEGEMANN 


In this issue, Konermann et al. combine 
two sparkling biological technologies 
developed over the past decade. The first 
is optogenetics, the process by which light-responsive 
proteins are engineered into target 
cells and used to regulate their activity. 
The second is the use of sequence-targeted 
DNA-cleaving enzymes to specifically alter 
the genome. By uniting these techniques, 
the authors present a versatile method for 
targeted control of gene transcription and 
genomic modifications. 

Optogenetics can be used in cells and in living 
organisms, and allows cellular regulation 
using light of different colours, intensities and 
duration in a graded, non-invasive, reversible 
and spatiotemporally precise fashion. Most 
optogenetic applications so far have relied on 
the use of light-sensitive ion channels and ion 
pumps to modulate the voltage dynamically 
across biological membranes, in particular to 
elicit action potentials in neurons. 

Nature offers a plethora of other processes 
that are regulated by light, such as those 
controlled by photoreceptors in plants and 
microorganisms. With only a few exceptions, 
these processes are intricately tied to their 
organism of origin, and their deployment in 
others is challenging. However, their existence 
suggests that optogenetics could be extended 
to regulating enzyme activity or could be used 
to induce more persistent effects by targeting 
DNA. Indeed, natural photoreceptors have 
provided design blueprints for the engineering 
of several biological systems with customized 
light responses. 

A particularly versatile strategy uses
photoreceptors that associate with other proteins in
a light-regulated process. In terms of performance,
robustness, response time and ease of use, the
blue-light-responsive protein cryptochrome 2 and
its light-induced interaction with its partner
protein CIB1 (ref. 4) currently have the edge over
alternative photodimerizers such as the
red-light-responsive phytochrome-PIF pair. Several
laboratories have successfully modulated gene
transcription using photodimerizing proteins.
Initially, these optogenetic systems were directed
to specific DNA sites by coupling to the
DNA-binding part of the transcriptional-activator
protein Gal4 (ref. 5). Light exposure recruited
an interacting protein that was coupled to the
activation domain of Gal4, thus initiating
transcription. Although these systems are
powerful, they are inherently limited because
they use DNA-binding domains with fixed
target-sequence specificity, and because target
genes have to be introduced into the host genome
as exogenous DNA templates.

In parallel with the introduction of opto- 
genetics, DNA-engineering strategies have 
been developed that can target unique sites 
among the billions of nucleotides in a genome. 
Early versions of such approaches were based
on zinc-finger and transcription-activator- 
like effector (TALE) proteins, which contain 
repetitive amino-acid sequences that recognize 
single DNA nucleotides or nucleotide triplets, 
and introduce double-stranded DNA breaks 
on binding to these sequences. The DNA- 
repair process that is activated in response to 
this damage can be used to introduce novel 
genetic elements at the site. However, adjust- 
ing the sequence specificity of zinc-finger and 
TALE proteins entails the laborious produc- 
tion of customized proteins. 

A more recently developed approach, called 
the CRISPR-Cas system, overcomes this
limitation. In this system, an endonuclease 
enzyme that induces a double-strand break 
is used with a sequence-specific guide RNA 
molecule — simply replacing the guide RNA 
is sufficient for sequence adaptation. The 
CRISPR-Cas technology stands to make 
engineering of zinc-finger and TALE proteins 
obsolete and to render genome engineering 
fast, efficient and inexpensive. 

Capitalizing on their expertise in both 
optogenetics and genome engineering, Kon- 
ermann et al. have overcome the sequence- 
restriction problem of earlier light-activated 
transcription-modulation approaches in 
their light-inducible transcriptional effector 
(LITE) system (Fig. 1a). The system uses a
TALE protein coupled to cryptochrome 2, and 
CIB1 coupled to the transcriptional-activator 
protein VP64. This combination results in 
the cellular transcriptional machinery being 
recruited to the genomic site defined by the 
TALE protein when blue light is absorbed. 
The authors showed in vitro that, following 
light exposure, site-specific gene expression 
was enhanced by 10-20 times compared with 
darkness, and they convincingly validated 
the technology in mouse neurons and in the 
brains of conscious mice by monitoring light- 
mediated transcription of the genes Grm2 and 
Neurog2. 

The LITE approach has several favourable 
characteristics. First, the light-responsive 
molecules of cryptochrome 2 are the chromo- 
phores flavin-adenine dinucleotide and 
methyltetrahydrofolate, which are universally 
abundant. Second, induction of transcrip- 
tion occurs within minutes of light exposure. 
Figure 1 | Modular control of genome function. a, Konermann and colleagues’ LITE system uses
transcription-activator-like effector (TALE) proteins, which specifically bind to unique DNA sequences,
coupled to the light-sensitive photoreceptor cryptochrome 2 (CRY2). On light exposure, a complex
of CRY2’s interaction partner CIB1 coupled to the transcriptional-activator protein VP64 is attracted to
CRY2, and VP64, in turn, attracts the cellular transcriptional machinery, initiating transcription at the
target site. This reaction is reversible. b, The system is highly versatile because the various components
can be interchanged. DNA targeting can be achieved using Gal4, zinc-finger proteins or the CRISPR-Cas
system. Proteins that respond to light of different colours (such as red light for the phytochromes
A/B (PhyA/B) and their interaction partners PIF3/6) or small molecules (such as rapamycin for the
FKBP-FRB interaction pair) can be used as the sensor and recruitment molecules. Also, different output
molecules can be used for various effects, including recombination (using endonucleases such as FokI),
transcriptional repression (through the protein KRAB) or histone modification (using enzymes that
elicit epigenetic effects).


Third, the response can be graded with light 
dose and is fully reversible after light retrac- 
tion. Finally, because light can be applied non- 
invasively, its use is not restricted to cultured 
cells but extends to freely moving animals, 
as established for conventional optogenetic 
tools.

Great power lies in the modularity and 
resultant versatility of this technique (Fig. 1b). 
By replacing constituent modules of LITE, the 
system can be tuned to be sensitive to light of 
different colours or to have different effector 
outputs. The authors impressively demon- 
strated this second possibility by interfacing 
LITE with various molecules that modify 
histones — the proteins around which DNA 
is wrapped. They show that their system can 
be used to site-specifically enhance histone 
methylation and acetylation — two epi- 
genetic modifications that regulate the rate of 
gene transcription. The LITE approach thus 
enriches the optogenetic arsenal with novel 
applications. 

Similarly, the TALE module of LITE can be 
exchanged for other DNA-binding modules,
including ones based on the CRISPR-Cas
system, as Konermann et al. demonstrate. 
Because the CRISPR-Cas system can be rap- 
idly directed to different DNA sites, this will 
allow faster fine-tuning of the efficacy of any 
LITE experiments. Thus, the combination of 
CRISPR-Cas and LITE may truly usher in 
a new era of systems biology, in which gene 
expression and epigenetic modifications can 
be manipulated at the genome level with 
supreme sequence specificity, exquisite tem- 
poral resolution and full reversibility. 

As with any new technology, there is room 
for improvement. In particular, it would be 
desirable to increase the degree of transcrip- 
tional activation by LITE. There is also the 
question of where the LITE system should be 
positioned in the genome to achieve maximum 
effect, but this can be easily addressed with the 
rapid manipulation offered by the CRISPR- 
Cas system. Even in its present implementa- 
tion, LITE represents a powerful approach 
to light-controlled genome programming. 
Given its versatility, ease of use, performance 
and potential for automation, we expect this 
technology to be widely and rapidly taken up
across many biological disciplines.
A solid triple point
and atmospheric pressure, with decreasing
pressure reducing that temperature (hence 


The observation of a triple point of coexistence between two insulating phases 
and a conducting phase in vanadium dioxide reveals physics that may help to 
unravel the role of electronic correlations in this material. SEE LETTER P.431 


DOUGLAS NATELSON 


Much of condensed-matter physics is concerned
with thermodynamic phases, their properties and
their transitions. In correlated materials, the
electron-electron and electron-lattice interactions
result in a competition between various
electronic, magnetic and structural phases. The
transitions between competing phases can reveal
information about the underlying states that is
otherwise difficult to obtain. On page 431 of this
issue, Park et al. use a micromechanical device
and single-crystal nanobeams to determine with
high precision the tensile stress-temperature
phase diagram of vanadium dioxide (VO₂), an
archetypal correlated oxide. Their experiment
reveals a surprising and interesting fact: the
metal-insulator phase transition for which VO₂
is famous is in fact a triple point, a rare
circumstance in which three phases (here two
insulators and a metal) can coexist. The
experiment also determines the entropy
differences between the various phases,
information crucial to a complete understanding
of the transition.

When a large amount of a substance 
(such as water) is brought together, it may 
exist in distinct phases (such as solid, 
liquid and gas). At given conditions, 
for example at a particular pressure and 
temperature, the thermodynamically 
stable phase is the one with the lowest 
free energy, which is determined by the 
arrangement, motion and interactions 
of the constituents. A phase diagram 
is a map of the stable phases as a func- 
tion of parameters such as pressure and 
temperature. 

When two phases coexist stably, their free
energies must be equal, and for a single species,
this condition leads to a coexistence ‘line’ for
the two phases in the phase diagram. For example,
ice and liquid water coexist in equilibrium at
0 °C and atmospheric pressure, and increasing the
pressure decreases the melting point. Similarly,
liquid water and water vapour coexist stably at
100 °C and atmospheric pressure, with decreasing
pressure reducing that temperature (hence water
boils at a lower temperature on top of a mountain
than at sea level). These two coexistence curves
can intersect only at a single value of pressure
and temperature, a triple point (Fig. 1a). For
water, this happens at 0.01 °C and 612 pascals.
This particular triple coexistence defines the
Kelvin temperature scale.

In VO₂, the competing phases of interest are all
solids, albeit with different lattice
Figure 1 | Phase transitions with triple points. a, The phase
diagram of water. At only one pressure and temperature can
solid, liquid and gaseous water coexist in equilibrium. This
triple point defines the Kelvin temperature scale. b, Park et al.
have mapped the phase diagram of vanadium dioxide. The 
triple point at zero tensile stress and the slopes of the phase 
boundaries greatly constrain theories that seek to understand 
the transitions from metal (R) to insulator (M1 or M2) in this 
material. Part b is based on Fig. 4b of the paper. 
structures and electronic properties: a
high-temperature metallic phase (with a rutile
lattice structure, R), and two insulating phases
(with monoclinic structures, M1 and M2). The
competition between these phases is of great
interest because of the marked change in
electronic and optical properties that occurs at
the metal-insulator transition, the proximity of
the transitions to room temperature, and the need
to better understand the underlying physics. The
relative importance of electron-electron
interactions (Mott physics) and lattice distortion
(Peierls physics) in stabilizing the M1 phase has
been debated for decades. In addition to
temperature, the intensive quantity relevant to
VO₂ is the tensile stress rather than the
pressure. The difficulty of controlling this
stress makes measurements in bulk crystals and
thin films challenging.

Single-crystal VO₂ nanobeams with a well-defined
tensile-stress profile along the beam have been a
boon to those trying to understand the intrinsic
physics of this material. Park et al. attached an
individual single-crystal VO₂ nanobeam to bridge
a notched silicon structure, and used a piezo
actuator to apply a controlled longitudinal
deformation to the nanobeam, and so vary its
length. Through polarized optical microscopy,
Raman microscopy and electrical measurements,
they identified regions of the suspended beam in
the M1, M2 and R phases. Because the entire
system was mounted on a temperature-controlled
stage, the authors were able to determine the
tensile stress-temperature phase diagram
(Fig. 1b) by performing measurements of phase
composition as a function of temperature at fixed
length (which they can relate to the stress) and
as a function of length at fixed temperature. To
obtain measurements at zero stress, they broke
the nanobeam to ensure that the remaining
suspended portions, which were no longer in
contact, were stress free.

By mapping the phase diagram with high 
precision, the authors extracted interesting 
clues that constrain theoretical treatments of 
the phases in this system. First, it turns out 
that the M1, M2 and R phases can all coexist 
at a triple point at 65.0°C that coincides with 
zero applied stress. There is no obvious reason 
why the M2 phase should become thermody- 
namically stable as soon as the tensile stress 
exceeds zero, as Park et al. observed. This fact, 
long obscured by lack of control over sam- 
ple stresses, is something that a microscopic 
theory should explain. Second, the authors 
determined the ratio of the resistivities of the 
two insulating phases, a parameter that in 
clean crystalline material is related to the den- 
sities of states at the energy of the highest occu- 
pied electronic states and the effective masses 
of the electrons, quantities that should be cal- 
culable using electronic-structure methods. 
Finally, they determined the entropy difference
per VO₂ group between the metallic phase and each
insulating phase at the triple point. Although
first-principles electronic-structure calculations
are challenging, particularly when trying to
understand effects at temperatures far above
absolute zero, some future computational approach
should be able to assess the relative
contributions of electronic and structural degrees
of freedom to these differences, further
illuminating the role of electronic correlations
in the transition(s).
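
One way to see why the slopes of phase boundaries carry entropy information is the Clausius-Clapeyron relation. The sketch below is a textbook argument rather than anything taken from Park and colleagues' paper; the symbols g, s, v, ε and v₀, and the sign convention chosen for tensile stress, are assumptions introduced here for illustration.

```latex
% Along a coexistence line the molar Gibbs free energies of phases 1 and 2 are equal:
g_1(T, P) = g_2(T, P)
% Moving along the line, with dg = -s\,dT + v\,dP for each phase:
-s_1\,dT + v_1\,dP = -s_2\,dT + v_2\,dP
\quad\Longrightarrow\quad
\frac{dP}{dT} = \frac{s_2 - s_1}{v_2 - v_1} = \frac{\Delta s}{\Delta v}
% Uniaxial analogue (assumed convention: tensile stress \sigma > 0, strain \varepsilon,
% reference molar volume v_0, so that dg = -s\,dT - v_0\,\varepsilon\,d\sigma):
\frac{d\sigma}{dT} = -\frac{\Delta s}{v_0\,\Delta\varepsilon}
```

For melting ice, Δs is positive but Δv is negative, so dP/dT is negative: raising the pressure lowers the melting point. In the stress form, a measured boundary slope combined with the strain difference between two phases would, under these assumptions, give their entropy difference.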

Many other correlated systems exhibit similar
phase competitions, including the manganites and
the rare-earth nickelates. The current work shows
the power of measurements that can combine
micrometre- or nanometre-scale single-crystal
materials, control of the stress and strain,
spatial mapping of phases and in situ electronic
transport. The importance of material quality and
stresses (for example due to lattice mismatch of
a film with a substrate) has long been known, and
studies of bulk samples under applied and chemical
pressure have been revealing in some correlated
systems. With the synthesis of novel structures
and experimental methods such as those described
here, the prospects are bright for new insights
into these incredibly rich, complex materials.
Sweet enticements to move


The formation of new blood vessels from pre-existing ones is a carefully
orchestrated dance. A study reveals that the metabolism of sugar by glycolysis
contributes to its regulation.


CHOLSOON JANG & ZOLTAN ARANY 


The breakdown of fuel by metabolism is the engine
that sustains all cellular activities. But can
metabolism also steer and control cellular
processes? Writing in Cell, De Bock et al. suggest
that the answer is yes, at least in the context of
glucose metabolism and angiogenesis, the formation
of new blood vessels*.

Glycolysis is the cellular process by which
glucose is converted into pyruvate. A cell then
makes a choice: it can convert pyruvate to
lactate, which exits the cell, for a net yield of
2 ATP molecules (the currency of cellular energy
transfer) or, in the presence of oxygen, the
pyruvate can enter cellular organelles called
mitochondria and become fully oxidized, producing
a net yield of more than 30 ATP molecules. One
would not expect any oxygenated cell to opt out of
this mitochondrial bonanza, but some do, in a
phenomenon first noted in cancer cells by Otto
Warburg in 1956. Cancer cells probably make this
choice because intermediate molecules formed
during glycolysis support the synthesis of
macromolecules needed for cellular replication.
But do any non-cancerous or even quiescent cells
also display the Warburg effect? Endothelial
cells, which line blood vessels throughout the
body and mediate angiogenesis, do, but until now
little was known about how metabolism

*This article and the paper under discussion were
published online on 14 August 2013. 
50 Years Ago


Outline of Human Genetics. By 
Prof. L. S. Penrose — Throughout, 
Prof. Penrose deals with just those 
points which are of general interest 
and particularly topics about which 
people ignorant of genetics are 
always asking, for example, Is natural 
selection still operating in spite of 
civilization and medical advances? ... 
In “Commentary” he explains
in more detail how common
chromosomal abnormalities, such 
as those causing mongolism and 
intersexes, are produced; mentions 
theories dealing with the possibility 
of inherited cancer; touches on 
pharmacogenetics; and outlines the 
vast amount of genetic variability 
which is being shown up by the 
complicated polymorphisms of the 
blood proteins. Finally, he makes 
the very good point that while 
geneticists are continually worrying 
about the quality of the human race 
we shall have doubled our numbers 
in the next 50 years and that birth 
control is far more important than 
the fruitless task of planning the 
superman. 

From Nature 24 August 1963. 


100 Years Ago 


An exhibit illustrating the damage 
caused to biscuits sent out in 
soldered tins for the use of the 
troops in South Africa—especially 
during the Boer war—Gibraltar, 
Malta, Ceylon, &c., has just been 
placed in the central hall of the 
British Museum (Natural History), 
where it will be kept open about
a month. The larvae of certain
minute moths and beetles were 
the active agents; and it appears 
that since these cannot, in all 
probability, withstand the high 
temperature to which the biscuits 
are subjected in baking, the eggs 
must be laid by the moths during 
the period when the biscuits are 
being cooled before tinning. 

From Nature 21 August 1913. 
Figure 1 | Glycolysis regulates angiogenesis. The formation of new blood vessels involves the outward
movement of endothelial cells from the lining of existing blood vessels, a process that relies on the rapid
reorganization of actin-protein filaments in cellular structures called filopodia and lamellipodia (not
shown). The energy for this (in the form of ATP) is provided by the breakdown of glucose, but endothelial
cells are unusual in that the pyruvate produced by glycolysis is converted to lactate, rather than being
channelled into mitochondria for further oxidation, as occurs in most cells. De Bock et al. show that
both angiogenesis and glycolysis are accelerated by the activity of the enzyme PFK2 in endothelial-cell
lamellipodia and filopodia. PFK2 converts the glycolytic intermediate fructose-6-phosphate (F-6-P) into
fructose-2,6-bisphosphate (F-2,6-P₂), which, in turn, enhances the activity of the glycolytic enzyme PFK1,
thereby accelerating glycolysis at these sites. Pyruvate then leaves the cell as lactate, probably because
filopodia and lamellipodia are too small to accommodate mitochondria.


affects their function. 

De Bock et al. began their study by confirming a
previous report that endothelial cells are highly
glycolytic but perform little pyruvate oxidation.
The authors then asked the interesting question:
could modulation of glycolytic activity have an
effect on angiogenesis? To assess this, they
altered the amount of phosphofructokinase 2 (PFK2)
in endothelial cells. PFK2 is a
glycolysis-regulating enzyme that was discovered
only in the 1980s, long after all key enzymes of
the glycolytic pathway were thought to be known.
The related enzyme PFK1, identified decades before
PFK2, catalyses the crucial committing step of
glycolysis: the conversion of fructose-6-phosphate
to fructose-1,6-bisphosphate. PFK2, by contrast,
converts fructose-6-phosphate to
fructose-2,6-bisphosphate, which is a potent
allosteric activator of PFK1 (ref. 6). Activation
of PFK2 thus drastically accelerates glycolytic
flux through PFK1.

De Bock et al. show that reducing PFK2 levels in
endothelial cells not only lowers glycolytic flux,
as expected, but also impairs angiogenesis, by
reducing the ability of the cells to form tip
cells, migrate and form blood-vessel sprouts.
Conversely, and importantly, increasing PFK2
levels has the opposite effect: angiogenesis is
increased. The authors also show that PFK2 lies
downstream of VEGF and Notch, two proteins that
are dominant determinants of endothelial-cell
characteristics during angiogenesis.


410 | NATURE | VOL 500 | 22 AUGUST 2013


© 2013 Macmillan Publishers Limited. All rights reserved

How does PFK2 achieve these effects? Per- 
haps most interestingly, the authors demon- 
strate that PFK2 localizes to structures at the 
margins of endothelial cells called lamellipo- 
dia and filopodia. These cellular projections, 
which contain meshes and filaments of the 
protein actin, mediate endothelial-cell move- 
ment and sprout formation during angiogen- 
esis (Fig. 1). PFK2 activity at this site probably 
coincides with the cellular position of large 
complexes of glycolytic enzymes, known as 
metabolons, which facilitate the channelling 
of metabolic products from one enzyme to the 
next. Thus, it seems that PFK2 alters angiogenic
capacity by altering glycolytic flux at the
site of primary cell motion. 

The study is important for several reasons. 
The findings imply that glucose metabolism 
can ‘steer’ the angiogenic process, in addition
to simply being its ‘engine’. This unveils
glucose metabolism as a potential target for
pro-angiogenic therapies (such as in patients 
with inadequate blood supply to the heart or 
limbs) or anti-angiogenic therapies (for exam- 
ple, to tackle tumours). Metabolic enzymes 
make good drug targets, so this is an exciting 
possibility. The study also provides an addi- 
tional explanation for why endothelial cells 
perform glycolysis rather than oxidative break- 
down of glucose: rapid local generation of ATP 
can occur in glycolytic metabolons located in 
the lamellipodia and filopodia, which are too 
small to accommodate mitochondria and are 
often found at angiogenic fronts where oxygen 
is scant. 

Like all seminal work, this study generates 
several questions. Does modulation of gly- 
colytic flux in ways other than through PFK2 
also affect angiogenic sprouting? Could non- 
enzymatic properties of PFK2 contribute to 
the observed phenomena? Such behaviour 
has been seen for pyruvate kinase, another key
enzyme in glycolysis that was recently found
to be present in the cell nucleus and associated
with transcription factors that drive gene
expression. Does PFK2 modulate the activi- 
ties of Rac, Akt and eNOS — key enzymes 
that regulate endothelial-cell motility — and, 
if so, how? How do Notch and VEGF signal to 
PFK2? Does glycolysis regulate migration of 
other cell types, such as smooth-muscle cells or 
macrophages, or even cancer cells? And is the 
pro-angiogenic activity of PFK2 altered when 
glucose homeostasis is perturbed, such as 
in diabetes? 

These questions aside, De Bock and col- 
leagues’ study deepens our understanding of 
why some cells choose to forgo the lucrative
use of mitochondria to break down their glu- 
cose, even when, as is the case for endothelial 
cells, the cells are not highly replicative. The 
authors’ findings also introduce a new concept 
in endothelial biology: that metabolic deci- 
sions can regulate the endothelial phenotype, 


Abundant equals nested 


How ecological network structures are influenced by species coexistence, 
community stability and perturbations is a topic of debate. It seems that one 
overlooked correlate of nested structures is species abundances. SEE LETTER P.449 


COLIN FONTAINE 


Understanding the mechanisms that shape
biodiversity is one of the main goals of ecology.
Network approaches, which integrate species and
the interactions among them into a single
framework, have proved enlightening, revealing
distinct ‘architectural’ patterns that are
strongly associated with particular ecological
interactions. For mutualistic networks (those in
which the interactions benefit both partners, such
as between a plant and its pollinator, or a fish
and a cleaner fish) the pervasive pattern seems to
be a nested one, whereby specialist species (which
have few partners) interact with a subset of the
many partners of more generalist species. The
origin and implications of nestedness remain
strongly debated. On page 449 of this issue,
Suweis et al. bring an innovative and intriguing
contribution to this topic by demonstrating strong
relationships among species abundances, nested
architecture and community stability.

Nestedness is a pattern characterized by several
features (Fig. 1), including a skewed distribution
of the number of interacting partners per species,
with

[Figure: interaction matrix; axes: species of group 1, species of group 2.]

Figure 1 | A nested network. The interactions between two
groups of mutualist species often assume a nested structure, in
which specialist species (s), which have few partners, interact with
a subset of the many partners of more generalist species (g). Here,
the intersection of a row and column is blue if the species interact.
Nested networks have certain characteristics, such as a continuum
from highly generalist to specialist species, a core of highly
connected species (red box) and a tendency for specialist species
to interact with generalists (for example, the specialist i interacts
with the generalist a). Suweis et al. show that the abundances
of species in a mutualistic network are positively related to the
nestedness of the network.
as well as vice versa. It turns out that, much like
children, endothelial cells that gorge on sugar
become hyperactive.
many specialist species and few extremely
generalist species. Nestedness also implies 
asymmetric specialization, such that special- 
ist species tend to interact with generalist ones. 
Finally, the generalist species in the nested 
network form a single, highly connected core, 
making the networks very cohesive. 
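
The subset property described above can be made concrete with a small sketch. The score below is a deliberately simplified illustration, not the nestedness measure used by Suweis et al. or any published metric; the function names and the two toy matrices are hypothetical and introduced only for this example.

```python
from itertools import combinations


def degree(row):
    """Number of partners of a species (count of 1s in its row)."""
    return sum(row)


def is_subset(small, large):
    """True if every partner of `small` is also a partner of `large`."""
    return all(l == 1 for s, l in zip(small, large) if s == 1)


def nestedness_score(matrix):
    """Fraction of row pairs in which the less-connected species'
    partners are a subset of the more-connected species' partners.
    Pairs with equal degree are skipped, because neither species
    is the more specialized one. Returns 0.0 if no pair qualifies."""
    nested = total = 0
    for a, b in combinations(matrix, 2):
        if degree(a) == degree(b):
            continue
        small, large = (a, b) if degree(a) < degree(b) else (b, a)
        total += 1
        nested += is_subset(small, large)
    return nested / total if total else 0.0


# Rows: species of group 1; columns: species of group 2 (1 = interaction).
nested_net = [
    [1, 1, 1, 1],  # most generalist species
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],  # most specialist species
]
scrambled_net = [
    [1, 1, 1, 0],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

print(nestedness_score(nested_net))
print(nestedness_score(scrambled_net))
```

In the first toy matrix every specialist's partners are contained in those of each more generalist species, so the score reaches its maximum; in the second, several specialists interact with partners that the generalists lack, and the score drops.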

Three main hypotheses have been proposed 
to explain the biology behind this seemingly 
highly organized structure. One is that nest- 
edness is ‘neutral’, meaning that all 
interactions between individuals are 
equally likely. Species abundances in 
many communities are well described 
by a log-normal distribution, with many
rare species and a few common ones. 
Under this hypothesis, differences in 
species abundance result in differences 
in interactions at the species level: abun- 
dant species are expected to interact 
more frequently and with more species 
than rare species, and rare species tend 
to interact with abundant species rather 
than with other rare species. However, 
the empirical correlation between species
abundances and species generalism is not easy
to interpret. Do species
become generalists because they are 
more abundant, or are they more abun- 
dant because they are generalists and 
therefore can access more resources? 

The second hypothesis suggests that 
nestedness affects ecological dynamics, 
particularly species coexistence and 
community stability. A simple argu- 
ment supporting this hypothesis is that 
it is much safer for specialist species to 
interact with generalist species than 
with other specialists, because generalist 
species are expected to have less-fluctu- 
ating population dynamics and so to be 
more reliable partners. Such constraints 
on community persistence or stability could
therefore be a driving force shaping interaction
networks. However, several recent investigations
of the links between network nestedness and
community dynamics in mutualistic species have
reached no consensus on this topic.

According to the third hypothesis, nested 
architecture may be shaped by the (co-)evo- 
lutionary dynamics of species interacting 
within a community. There are many exam- 
ples of interspecies interactions affecting the 
fitness of individuals, and of the evolution of 
species traits controlling the identity of poten- 
tial interaction partners. Closely related spe- 
cies in mutualistic interaction networks tend 
to have similar interacting partners, which 
emphasizes the idea that evolutionary history 
has an impact on the structure of mutualistic 
networks. But, so far, no precise evolutionary
process has been directly related to a nested 
structure. 

Suweis et al. have drawn these three hypoth- 
eses together by demonstrating a two-step 
relationship between species abundances in 
a community and the nestedness of the inter- 
action network that depicts that community. 
Using analytical and simulation approaches, 
the authors first show that, under stationary 
conditions that have a constant number and 
strength of mutualistic interactions, ‘inter- 
action swaps’ (an exchange of interactions 
between two species couples) that lead to an 
increase in the abundance of the species also 
increase the total abundance of the commu- 
nity. Second, the researchers demonstrate 
that total community abundance is positively 
related to the nestedness of the network. This 
connection opens up fascinating perspectives. 

To demonstrate the implications of their 
findings, the authors show that, under the con- 
dition that exchanges result in increased spe- 
cies abundance, iterative swapping ultimately 
converts random networks, with randomly 
distributed interactions among species, into 
nested networks. The interpretation of this 
is that any process that maximizes species 
abundance through changes in interspecies 
interactions will lead to a nested network. 
The question thus becomes, what biological 
process could select for higher population 
size? Selection at the population level involves 
group-selection processes such as hard selection.
More work is needed to unravel the
microevolutionary processes that affect net- 
work architecture, but this line of research 
seems promising. 

Suweis and colleagues further demonstrate 
that the population size of the rarest species 
in the community is positively related to com- 
munity resilience — the speed at which com- 
munity dynamics return to equilibrium after 
a small perturbation. These results fuel the 
current debate about the relationship between 
network architecture and community stability
by introducing the distribution of
species abundance as a key element. Again,
however, the processes through which the 
abundance of the rarest species relates to com- 
munity resilience remain to be identified. They 
may involve the rarest species directly, or may 
emerge from other mechanisms affecting both 
the rarest species and community resilience. 
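
Resilience in this sense is commonly quantified from the dynamics linearized around equilibrium. The sketch below uses that standard definition, which may differ in detail from the one Suweis et al. adopt; the symbols x, f, J, λ and R are notation introduced here, not taken from the paper.

```latex
% Community dynamics for abundances x = (x_1, \dots, x_S), with equilibrium x^*:
\frac{dx}{dt} = f(x), \qquad f(x^*) = 0
% A small perturbation \delta x = x - x^* evolves under the Jacobian J evaluated at x^*:
\frac{d\,\delta x}{dt} \approx J\,\delta x,
\qquad J_{ij} = \left.\frac{\partial f_i}{\partial x_j}\right|_{x^*}
% Asymptotic resilience: the decay rate of the slowest-decaying perturbation
R = -\max_k \operatorname{Re}\,\lambda_k(J)
```

Under this definition the equilibrium is stable when R is positive, and a larger R means a faster return after a small perturbation.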

Last but not least, the relationship found by 
Suweis et al. between network nestedness and 
total community abundance goes both ways. 
Abundance is correlated with biomass, which 
is one of the main variables used in studies of 
biodiversity and ecosystem functioning, so 
the two-way relationship provides a bridge 
between the authors’ results and the rich lit- 
erature on these topics. We already know that 
the structure of food webs, for example, can 
affect the relationship between biodiversity 
and ecosystem function. But little is known
about the impact of mutualistic networks on 
the functioning of ecological communities. 
Like all exciting pieces of research, Suweis and 
colleagues’ work raises more questions than 
it answers.


A gut feeling for isolation


The far-reaching effects of the relationship between an animal and its resident
gut microorganisms are becoming ever clearer. New findings suggest it can even
create barriers that keep species separate.


GREGORY D. D. HURST & CHRIS D. JIGGINS 


The process of speciation, whereby one
lineage splits into two independent gene
pools, has at its heart the evolution of
barriers to gene flow that maintain differences
when populations are in contact. Gene
flow can be reduced in many ways, including
failure to mate, sperm–egg incompatibility,
and sterility or inviability of hybrids. Writing
in Science, Brucker and Bordenstein describe
a novel source of reproductive isolation: the 
influence of resident gut microorganisms on 
hybrid survival. 

The concept of microbial involvement
in reproductive isolation is not new. In the
1990s, it was recognized that the very low
survival rates of hybrid offspring from two
closely related wasp species were influenced by
the presence and strain of Wolbachia bacteria
in the parents. More recently, it was demonstrated
that environmentally induced changes
in the composition of the gut microbiota
could affect mate preference in Drosophila
fruitflies. Brucker and Bordenstein’s work
likewise examined the role of gut microbiota
in reproductive isolation, but focused on the
death of hybrid larvae rather than mate prefer- 
ence, and studied a situation in which the pool 
of environmental microbes was constant. 

Their study organisms were parasitic wasps 
of the genus Nasonia, which lay their eggs in 
the pupae of flesh and filth flies (Fig. 1). The fly 
host represents both a source of nutrition and 
an environmental pool of microbes, and the 
authors had previously established that different
Nasonia species acquire distinct communities
of resident gut microorganisms (their
‘gut microbiomes’) from this common micro- 
bial pool. This differentiation of microbiomes 
was linked to the hosts’ phylogeny: the micro- 
biomes of individuals from the closely related 
species Nasonia giraulti and Nasonia longi- 
cornis were more similar than that of a more 
distantly related species, Nasonia vitripennis. 
Brucker and Bordenstein hypothesized that 
this differentiation creates a setting in which 
dysfunctional interactions could arise between 
hybrids and their gut microbiota. 

To test this idea, the authors examined the 
male offspring formed by crosses between 
N. vitripennis and N. giraulti, most of which
die during larval development.
They noted that dying
larvae were melanized, a 
characteristic of microbial 
pathology. Furthermore, the 
gut microbiota of the hybrid 
larvae were dominated by 
Proteus mirabilis bacteria, 
in contrast to those of the 
parental species, which are 
dominated by Providencia 
species. By manipulating 
the exposure of the larvae 
to bacteria, the researchers 
established that the presence 
of gut microorganisms was 
necessary for hybrid pathol- 
ogy and death: the hybrid lar- 
vae had near-normal fitness 
when they were reared on a 
bacteria-free diet, but their 
viability declined when Pro- 
teus and Providencia bacteria 
were introduced to the culture 
medium together, and when 
Providencia were introduced 
alone. Hybrid lethality was 
also reinstated when Escheri- 
chia coli bacteria, which are 
not typically found in the 
guts of these wasps, were 
introduced to the bacteria- 
free medium. Intriguingly, 
the authors also found that 
several gene-variant combina- 
tions in the host genome that 
have previously been associ- 
ated with hybrid inviability 
were present in normal inheritance
ratios in larvae reared
on a bacteria-free diet.

These findings demonstrate that hybrid
inviability can be associated with a perturbed
host-microbiome interaction. Indeed, there
are reasons to believe that the involvement
of symbiotic microorganisms in reproductive
isolation may be common. First, resident
microbiota affect both organismal development
and function, and this is likely to be the
case in all species with a gut. Second,
divergences in animal-microbiome interactions
between lineages are widely observed, and
lineage divergence is the core

will diverge over time, from both sides, such
that hybridization creates interface combinations
that may malfunction.

Brucker and Bordenstein previously extended
the concept of genetic interactions leading
to hybrid dysfunction (called
Bateson-Dobzhansky-Muller incompat-

development and regulation of their
microbiome. This breakdown echoes other cases
in which biotic interactions are important in
creating hybrid inviability. In Heliconius
butterflies, for example, related species
diverge to have distinct warning colouration
patterns that reduce predation by birds, but
hybrids between the species have a pattern
that is not recognized, resulting in increased
predation and a form of ‘extrinsic’ hybrid
inviability.

The work of Brucker and Bordenstein provokes
several questions. For example, how many cases
of hybrid inviability derive from a
misfunctioning interface between the host and
its microbiota? Can dietary shifts drive
evolutionary divergence at the host-microbiome
interface, and thereby contribute to the
evolution of reproductive isolation? Perhaps
most exciting is the idea that there may be
interactions with specific microbiome
components in hybrids. If this were the case,
the microbiome would expand the network of
possible interactions within an organism, and
potentially accelerate the rate at which
incompatibility evolves. Our gut feeling is
clear: the evolutionary biology of the intimate
and complex interactions between animals and
microbes will be a hot topic in the years to
come.

Figure 1 | Laid down for life. This scanning electron micrograph shows a Nasonia
vitripennis wasp laying eggs into the pupa of a Sarcophaga bullata flesh fly. The eggs
(blue) hatch about 24 hours after being laid, and the larvae (purple) remain under the
outer casing of the pupa for about nine days, using the fly as a nutrient source. Brucker
and Bordenstein show that bacteria acquired during feeding, which are normally
symbiotic with the wasp, can kill hybrid wasp larvae.
ROBERT M. BRUCKER


Signatures of mutational processes in human cancer


A list of authors and their affiliations appears at the end of the paper 


All cancers are caused by somatic mutations; however, understanding of the biological processes generating these 
mutations is limited. The catalogue of somatic mutations from a cancer genome bears the signatures of the mutational 
processes that have been operative. Here we analysed 4,938,362 mutations from 7,042 cancers and extracted more than 
20 distinct mutational signatures. Some are present in many cancer types, notably a signature attributed to the APOBEC 
family of cytidine deaminases, whereas others are confined to a single cancer class. Certain signatures are associated 
with age of the patient at cancer diagnosis, known mutagenic exposures or defects in DNA maintenance, but many are of 
cryptic origin. In addition to these genome-wide mutational signatures, hypermutation localized to small genomic 
regions, ‘kataegis’, is found in many cancer types. The results reveal the diversity of mutational processes underlying 
the development of cancer, with potential implications for understanding of cancer aetiology, prevention and therapy. 


Somatic mutations found in cancer genomes may be the consequence
of the intrinsic slight infidelity of the DNA replication machinery,
exogenous or endogenous mutagen exposures, enzymatic modification
of DNA, or defective DNA repair. In some cancer types, a substantial
proportion of somatic mutations are known to be generated
by exposures, for example, tobacco smoking in lung cancers and
ultraviolet light in skin cancers, or by abnormalities of DNA maintenance,
for example, defective DNA mismatch repair in some
colorectal cancers. However, our understanding of the mutational
processes that cause somatic mutations in most cancer classes is
remarkably limited.

Different mutational processes often generate different combinations
of mutation types, termed ‘signatures’. Until recently, mutational signatures
in human cancer have been explored through a small number
of frequently mutated cancer genes, notably TP53 (ref. 4). Although
informative, these studies have limitations. To generate a mutational
signature, a single mutation from each cancer sample is entered into a
mutation set aggregated from several cases of a particular cancer type. A
signature that contributes the large majority of somatic mutations in the
tumour class is accurately reported. However, if multiple mutational
processes are operative, a jumbled composite signature is generated.
Furthermore, because such studies are based on ‘driver’ mutations,
signatures of selection are superimposed on the signatures of mutational
processes.
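The composite-signature problem can be illustrated with a small sketch. The two six-class spectra below (`PROCESS_X`, `PROCESS_Y`) and the mutation counts are hypothetical values chosen for illustration, not data from the paper; the only idea taken from the text is that pooling mutations from several operative processes yields a blend that matches none of them.

```python
# Hypothetical six-class spectra for two mutational processes (illustrative
# values, not taken from the paper). Each spectrum sums to 1.
CLASSES = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
PROCESS_X = [0.05, 0.05, 0.70, 0.05, 0.10, 0.05]  # mostly C>T
PROCESS_Y = [0.60, 0.10, 0.10, 0.10, 0.05, 0.05]  # mostly C>A


def composite_spectrum(n_x, n_y):
    """Spectrum of a pooled mutation set with n_x mutations from process X
    and n_y from process Y: a mutation-weighted average of the two."""
    total = n_x + n_y
    return [(n_x * x + n_y * y) / total for x, y in zip(PROCESS_X, PROCESS_Y)]


def dominant_class(spectrum):
    """Return the substitution class with the largest share."""
    best = max(range(len(CLASSES)), key=lambda i: spectrum[i])
    return CLASSES[best]


# One dominant process: the pooled spectrum still looks like process X.
mostly_x = composite_spectrum(n_x=95, n_y=5)
# Two comparably active processes: the pooled spectrum matches neither.
mixed = composite_spectrum(n_x=50, n_y=50)
```

With a 95:5 split the pooled spectrum remains dominated by C>T, whereas the 50:50 pool sits between the two inputs, which is the "jumbled composite" the text describes.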

Recent advances in sequencing technology have overcome past limitations
of scale. Thousands of somatic mutations can now be identified
in a single cancer sample, offering the possibility of deciphering
mutational signatures even when several mutational processes are
operative. Moreover, because most mutations in cancer genomes are
‘passengers’, they do not bear strong imprints of selection.

Figure 1 | The prevalence of somatic mutations across human cancer types.
Every dot represents a sample whereas the red horizontal lines are the median
numbers of mutations in the respective cancer types. The vertical axis (log
scaled) shows the number of mutations per megabase whereas the different
cancer types are ordered on the horizontal axis based on their median numbers
of somatic mutations. We thank G. Getz and colleagues for the design of this
figure. ALL, acute lymphoblastic leukaemia; AML, acute myeloid leukaemia;
CLL, chronic lymphocytic leukaemia.

Figure 2 | Validated mutational signatures found in human cancer. Each
signature is displayed according to the 96 substitution classification defined by
the substitution class and sequence context immediately 3’ and 5’ to the
mutated base. The probability bars for the six types of substitutions are
displayed in different colours. The mutation types are on the horizontal axes,
whereas vertical axes depict the percentage of mutations attributed to a specific
mutation type. All mutational signatures are displayed on the basis of the
trinucleotide frequency of the human genome. A higher resolution of each
panel is found respectively in Supplementary Figs 2-23. Asterisk indicates
mutation type exceeding 20%.

We recently developed an algorithm to extract mutational signatures
from catalogues of somatic mutations and applied it to 21 breast
cancer whole-genome sequences. Novel and known signatures were
revealed, with the contribution of each signature to each cancer sample
and the timing of its activity estimated. Further studies have demonstrated
that the approach can also be applied, albeit with less power, to
mutational catalogues from sequences of all coding exons (exomes).
Global sequencing initiatives are now yielding catalogues of somatic
mutations from thousands of cancers. We have therefore applied this
method to survey the repertoire of mutational signatures and processes
operating across the spectrum of human neoplasia.


Mutational catalogues

We compiled 4,938,362 somatic substitutions and small insertions/
deletions (indels) from the mutational catalogues of 7,042 primary
cancers of 30 different classes (507 from whole genome and 6,535 from
exome sequences) (Supplementary Fig. 1). In all cases, normal DNA
from the same individuals had been sequenced to establish the somatic
origin of variants.

The prevalence of somatic mutations was highly variable between 
and within cancer classes, ranging from about 0.001 per megabase 
(Mb) to more than 400 per Mb (Fig. 1). Certain childhood cancers 
carried fewest mutations whereas cancers related to chronic mutagenic 
exposures such as lung (tobacco smoking) and malignant melanoma 
(exposure to ultraviolet light) exhibited the highest prevalence. This 
variation in mutation prevalence is attributable to differences between 
cancers in the duration of the cellular lineage between the fertilized egg 
and the sequenced cancer cell and/or to differences in somatic mutation
rates during the whole or parts of that cellular lineage.
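To give a feel for the range quoted above, the arithmetic behind a "mutations per megabase" figure can be sketched as follows. The sequenced-territory sizes (`GENOME_BASES`, `EXOME_BASES`) and the mutation counts are round-number assumptions made for this example; the paper's actual denominators are not given in this section.

```python
def mutations_per_megabase(n_mutations, bases_covered):
    """Mutation prevalence per megabase (1 Mb = 1,000,000 bases)."""
    if bases_covered <= 0:
        raise ValueError("bases_covered must be positive")
    return n_mutations / (bases_covered / 1_000_000)


# Assumed, round-number territories (not taken from the paper).
GENOME_BASES = 3_000_000_000  # roughly a whole genome
EXOME_BASES = 30_000_000      # roughly an exome

# Hypothetical samples at the two ends of the reported range.
low = mutations_per_megabase(3, GENOME_BASES)        # about 0.001 per Mb
high = mutations_per_megabase(12_000, EXOME_BASES)   # about 400 per Mb
fold_range = high / low
```

Under these assumptions the two extremes differ by a factor of several hundred thousand, which is why Fig. 1 uses a log-scaled axis.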


The landscape of mutational signatures 

In principle, all classes of mutation (such as substitutions, indels,
rearrangements) and any accessory mutation characteristic, for example, the
sequence context of the mutation or the transcriptional strand on which
it occurs, can be incorporated into the set of features by which a
mutational signature is defined. In the first instance, we extracted mutational
signatures using base substitutions and additionally included information
on the sequence context of each mutation. Because there are six
classes of base substitution (C>A, C>G, C>T, T>A, T>C and T>G, with all
substitutions referred to by the pyrimidine of the mutated Watson-Crick
base pair) and as we incorporated information on the bases
immediately 5’ and 3’ to each mutated base, there are 96 possible mutations
in this classification. This 96 substitution classification is particularly
useful for distinguishing mutational signatures that cause the same
substitutions but in different sequence contexts.

Applying this approach to the 30 cancer types revealed 21 distinct
validated mutational signatures (Supplementary Table 1 and Supplementary
Figs 2-28). These show substantial diversity (Fig. 2 and
Supplementary Figs 2-23). There are signatures characterized by
prominence of only one or two of the 96 possible substitution mutations,
indicating remarkable specificity of mutation type and sequence
context (signature 10). By contrast, others exhibit a more-or-less equal
representation of all 96 mutations (signature 3). There are signatures
characterized predominantly by C>T (signatures 1A/B, 6, 7, 11, 15,
19), C>A (4, 8, 18), T>C (5, 12, 16, 21) and T>G mutations (9, 17),
with others showing distinctive combinations of mutation classes
(2, 13, 14).

Signatures 1A and 1B were observed in 25 out of 30 cancer classes
(Fig. 3). Both are characterized by prominence of C>T substitutions
at NpCpG trinucleotides. Because they are almost mutually exclusive
among tumour types they probably represent the same underlying
process, with signature 1B representing less efficient separation from
other signatures in some cancer types. Signature 1A/B is probably
related to the relatively elevated rate of spontaneous deamination
of 5-methyl-cytosine which results in C>T transitions and which
predominantly occurs at NpCpG trinucleotides. This mutational
process operates in the germ line, where it has resulted in substantial
depletion of NpCpG sequences, and in normal somatic cells.

Figure 3 | The presence of mutational signatures across human cancer
types. Cancer types are ordered alphabetically as columns whereas mutational
signatures are displayed as rows. ‘Other’ indicates mutational signatures for
which we were not able to perform validation or for which validation failed
(Supplementary Figs 24-28). Prevalence in cancer samples indicates the
percentage of samples from our data set of 7,042 cancers in which the signature
contributed a significant number of somatic mutations. For most signatures, a
significant number of mutations in a sample is defined as more than 100
substitutions or more than 25% of all mutations in that sample. MMR,
mismatch repair.

Signature 2 is characterized primarily by C>T and C>G mutations
at TpCpN trinucleotides and was found in 16 out of 30 cancer types
(Fig. 3). On the basis of similarities in mutation type and sequence
context we previously proposed that signature 2 is due to overactivity
of members of the APOBEC family of cytidine deaminases, which
convert cytidine to uracil, coupled to activity of the base excision
repair and DNA replication machineries.
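The 96 substitution classification used throughout this section (six pyrimidine-referenced substitution classes, each with four possible 5’ and four possible 3’ neighbours) can be enumerated directly. The following is a minimal sketch; the label format `A[C>T]G` and the function names are conventions chosen for this example and are not taken from the paper.

```python
from itertools import product

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# The six substitution classes, each named by the pyrimidine (C or T) of the
# mutated Watson-Crick base pair.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def all_mutation_types():
    """Enumerate the 6 x 4 x 4 = 96 types, written like 'A[C>T]G'."""
    return [f"{five}[{sub}]{three}"
            for sub in SUBSTITUTIONS
            for five, three in product(BASES, repeat=2)]


def reverse_complement(seq):
    """Reverse-complement a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def classify(trinucleotide, alt):
    """Map a substitution seen on the reference strand to one of the 96 types.

    `trinucleotide` is the reference base with its 5' and 3' neighbours and
    `alt` is the mutant base. Purine references are reverse-complemented so
    that the mutated base is always reported as a pyrimidine.
    """
    five, ref, three = trinucleotide
    if ref == alt:
        raise ValueError("alt must differ from the reference base")
    if ref in "AG":
        five, ref, three = reverse_complement(trinucleotide)
        alt = COMPLEMENT[alt]
    return f"{five}[{ref}>{alt}]{three}"
```

For example, a G>A change in the reference context CGT and a C>T change in the context ACG both map to `A[C>T]G`, because they describe the same base-pair change read from opposite strands.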

In most cancer classes at least two mutational signatures were 
observed, with a maximum of six in cancers of the liver, uterus and 
stomach. Although these differences may, in part, be attributable to 
differences in the power to extract signatures, it seems likely that some 
cancers have a more complex repertoire of mutational processes than 
others. 

Most individual cancer genomes exhibit more than one mutational 
signature and many different combinations of signatures were observed 
(Fig. 4 and Supplementary Figs 29-88). The patterns of contribution to 
individual cancer samples vary markedly between signatures. Signature 
1A/B contributes relatively similar numbers of mutations to most cancer 
cases whereas other signatures contribute overwhelming numbers of 
mutations to some cancer samples but very few to others of the same 
cancer class, for example, signatures 2, 3, 4, 6, 7, 9, 10, 11, 13 (Fig. 4). 


Mutational signatures and age of cancer diagnosis 

We examined each cancer type for correlations between age of dia- 
gnosis and the number of mutations attributable to each signature in 
each sample. Signature 1A/B exhibited strong positive correlations 
with age in the majority of cancer types of childhood and adulthood 
(Supplementary Table 2). No other mutational signature showed a 
consistent correlation with age of diagnosis. 
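A per-signature age analysis of this kind could be sketched as below. The section does not state which correlation statistic was used, so the rank-based (Spearman) coefficient here is an assumption, and the ages and mutation counts are invented for illustration only.

```python
def ranks(values):
    """Average ranks (1-based), with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical samples from one cancer type (not data from the paper).
ages = [34, 45, 52, 61, 70, 78]
clock_like = [410, 520, 600, 640, 810, 900]     # rises steadily with age
sporadic = [3000, 150, 9000, 40, 2500, 700]     # no monotonic trend

rho_clock_like = spearman(ages, clock_like)
rho_sporadic = spearman(ages, sporadic)
```

A signature that accumulates at a steady rate gives a coefficient near 1, whereas one driven by variable exposures gives a weak or inconsistent coefficient; a real analysis would also need a significance test and correction for testing many signatures and cancer types.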

The mutations in a cancer genome may be acquired at any stage in
the cellular lineage from the fertilized egg to the sequenced cancer cell.
The correlation with age of diagnosis is consistent with the hypothesis
that a substantial proportion of signature 1A/B mutations in cancer
genomes have been acquired over the lifetime of the cancer patient, at
a relatively constant rate that is similar in different people, probably in
normal somatic tissues. The absence of consistent correlation of all
other signatures with age suggests that mutations associated with these
have been generated at different rates in different people, possibly as a
consequence of differing carcinogen exposures or after neoplastic
change has been initiated.

Figure 4 | The contributions of mutational signatures to individual cancers
of selected cancer types. Each bar represents a typical selected sample from the
respective cancer type and the vertical axis denotes the number of mutations
per megabase. Contributions across all cancer samples can be found in
Supplementary Figs 29-58. A summary of the total contributions for all operative
mutational processes in a cancer type can be found in Supplementary Figs 59-88.
‘Other’ indicates mutational signatures for which we were not able to
perform validation or for which validation failed (Supplementary Figs 24-28).


Mutational signatures with transcriptional strand bias 


The efficiency of DNA damage and DNA maintenance processes can 
differ between the transcribed and untranscribed strands of genes. The
best-known cause of this phenomenon is transcription-coupled
nucleotide excision repair (NER), which operates predominantly on the
transcribed strand of genes and is recruited by RNA polymerase II
when it encounters bulky DNA helix-distorting lesions.

We re-extracted substitution mutational signatures incorporating 
the transcriptional strand on which each mutation has taken place. 
Because a mutation in a transcribed genomic region may be either on 
the transcribed or the untranscribed strand, this generates a classifica- 
tion with 192 mutation subclasses. 
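The step from 96 to 192 subclasses amounts to tagging every substitution type with the strand that carries the mutated pyrimidine. A minimal sketch follows; the label format and function names are illustrative, and the strand convention (the transcribed strand is the template strand, complementary to the mRNA-like strand on which a gene is annotated) is stated here as an assumption rather than taken from the paper.

```python
from itertools import product

BASES = "ACGT"
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
STRANDS = ["transcribed", "untranscribed"]


def all_stranded_types():
    """Enumerate 96 x 2 = 192 subclasses, e.g. ('A[C>A]G', 'transcribed')."""
    return [(f"{five}[{sub}]{three}", strand)
            for sub in SUBSTITUTIONS
            for five, three in product(BASES, repeat=2)
            for strand in STRANDS]


def pyrimidine_strand(ref_base, gene_strand):
    """Say which strand of a gene carries the mutated pyrimidine.

    `ref_base` is the reference base on the '+' genomic strand and
    `gene_strand` ('+' or '-') is the strand whose sequence matches the mRNA.
    Assumed convention: the transcribed strand is the template strand, that
    is, the complement of the mRNA-like strand.
    """
    if ref_base not in BASES or gene_strand not in "+-":
        raise ValueError("unexpected base or strand")
    pyrimidine_on = "+" if ref_base in "CT" else "-"
    return "untranscribed" if pyrimidine_on == gene_strand else "transcribed"
```

Mutations outside transcribed regions have no strand label under this scheme, so they would be left out of a strand-aware catalogue.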

Several signatures showed substantial differences in mutation pre- 
valence between transcribed and untranscribed strands (known as 
transcriptional strand bias) (Fig. 5 and Supplementary Figs 89-95). 
For example, signature 4 shows transcriptional strand bias for C>A 
mutations (Fig. 5). Signature 4 is observed in lung adeno, squamous 
and small cell carcinomas, head and neck squamous, and liver cancers 
(Fig. 3), most of which are known to be caused by tobacco smoking. 
Therefore, signature 4 is probably an imprint of the bulky DNA adducts 
generated by polycyclic hydrocarbons found in tobacco smoke and 
their removal by transcription-coupled NER. The higher prevalence
of C>A mutations on transcribed compared to untranscribed strands is 
consistent with the propensity of many tobacco carcinogens to form 
adducts on guanine. 

Similarly, signature 7, mainly found in malignant melanoma, shows 
a higher prevalence of C>T mutations on the untranscribed compared 
to the transcribed strands consistent with the formation, through ultra- 
violet exposure, of pyrimidine dimers and other lesions which are known 
to be repaired by transcription-coupled NER.
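One simple way to ask whether a substitution class shows transcriptional strand bias is to compare its counts on the two strands against an even split. The sketch below uses an exact two-sided binomial test; the choice of test, the 50:50 null (which ignores any difference in sequence composition between strands) and the counts are all assumptions made for illustration, not the paper's procedure.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial P value: the total probability of all
    outcomes that are no more likely than the observed count k out of n."""
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so that equally likely outcomes are included.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


def strand_bias(transcribed, untranscribed):
    """Return the transcribed-strand fraction and its P value under a
    50:50 null hypothesis."""
    n = transcribed + untranscribed
    return transcribed / n, binomial_two_sided_p(transcribed, n)


# Hypothetical counts of one substitution class in transcribed regions.
biased_fraction, biased_p = strand_bias(140, 60)
balanced_fraction, balanced_p = strand_bias(102, 98)
```

A 140:60 split is very unlikely under an even null, whereas 102:98 is entirely compatible with it.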

Beyond these known examples of DNA damage processed by
transcription-coupled NER, other signatures show strong transcriptional
strand bias (5, 8, 10, 12, 16). Notably, signature 16, which is
characterized by T>C mutations at ApTpA, ApTpG and ApTpT
trinucleotides and is observed in hepatocellular carcinomas, shows
the strongest transcriptional strand bias of any signature, with T>C
mutations occurring almost exclusively on the transcribed strand
(Fig. 5). Similarly, signature 12, which features T>C mutations at
NpTpN trinucleotides, also found in hepatocellular carcinomas,
shows strong transcriptional strand bias with more T>C mutations
on the transcribed than untranscribed strands (Supplementary Fig. 94).
On the assumption that the transcriptional strand biases in signatures
12 and 16 are introduced by transcription-coupled NER, these
currently unexplained signatures may be the result of bulky DNA
helix-distorting adducts on adenine. However, there is no previous
basis for invoking transcription-coupled NER in the genesis of these
signatures and other causes of transcriptional strand bias may exist.

Figure 5 | Selected mutational signatures with strong transcriptional strand
bias. Mutations are shown according to the 192 mutation classification
incorporating the substitution type, the sequence context immediately 5’ and 3’
to the mutated base and whether the mutated pyrimidine is on the transcribed
or untranscribed strand. The mutation types are displayed on the horizontal
axis, whereas the vertical axis depicts the percentage of mutations attributed to a
specific mutation type. A higher resolution version of all mutational signatures
with strong transcriptional strand bias is found respectively in Supplementary
Figs 89-95.


Mutational signatures with insertions and deletions 


We re-extracted the mutational signatures including, in addition to 
the 96 substitution types, two further classes of mutation: indels at 
short nucleotide repeats and indels with overlapping microhomology 
at breakpoint junctions. Three of the 21 base substitution signatures
were associated with large numbers of indels. Signature 6, which is
characterized predominantly by C>T mutations at NpCpG trinucleotides, but is distinct
from signature 1A/B, contributes very large numbers of substitutions 
and small indels (mostly of 1 bp) at nucleotide repeats to subsets of 
colorectal, uterine, liver, kidney, prostate, oesophageal and pancreatic 
cancers. This pattern of indels, often termed ‘microsatellite instability’, 
is characteristic of cancers with defective DNA mismatch repair. Consistent
with this explanation, the presence of signature 6 was strongly
associated with the inactivation of DNA mismatch repair genes in 
colorectal cancer (P = 3.3 X 10°). 

Signature 15 also contributes very large numbers of substitutions 
and small indels at nucleotide repeats but, compared to signature 6, 
exhibits greater prominence of C>T at GpCpN trinucleotides. 
Signature 15 was found in several samples of lung and stomach cancer 
and its origin is currently unknown. 

By contrast, substantial numbers of larger deletions (up to 50 bp) 
with overlapping microhomology at breakpoint junctions were found 
in breast, ovarian and pancreatic cancer cases with major contribu- 
tions from signature 3. A subset of cancer cases of these three classes is 
known to be due to inactivating mutations in BRCA1 and BRCA2, and 
the presence of signature 3 was strongly associated with BRCA1 and
BRCA2 mutations within the individual cancer types (P = 1.6 X 10 * 
for breast cancer and P = 0.02 for pancreatic cancer). Indeed, almost
all cases with BRCA1 and BRCA2 mutations showed a large contribution
from signature 3. However, some cases with a substantial contribution
from signature 3 did not have BRCA1 and BRCA2 mutations,
indicating that other mechanisms of BRCA1 and BRCA2 inactivation
or abnormalities of other genes may also generate it.

BRCA1 and BRCA2 are implicated in homologous-recombination-based
DNA double-strand break repair. Abrogation of their functions
results in non-homologous end-joining mechanisms, which can use 
microhomology at rearrangement junctions to rejoin double-strand 
breaks, taking over DNA double-strand break repair. The results show 
that, in addition to the genomic structural instability conferred by 
defective double-strand break repair, a base substitution mutational 
signature is associated with BRCA1 and BRCA2 deficiency. 


Associating cancer aetiology and mutational signatures 


Each mutational signature is the imprint left on the cancer genome 
by a mutational process that may include one or more DNA damage 
and/or DNA maintenance mechanisms, with the latter either func- 
tioning normally or abnormally. Here we consider likely mechanisms 
or underlying causes by comparing signatures with mutation patterns 
of known causation in the scientific literature or by associating them 
with epidemiological and biological features of particular cancer types. 

Signature 1A/B is probably due to the endogenous mutational process
present in most normal and neoplastic cells that is initiated by deamination
of 5-methyl-cytosine. Other signatures are probably attributable to
exogenous mutagenic exposures. Signature 7 is observed in malignant 
melanoma and squamous carcinoma of the head and neck and has the 
known features of ultraviolet-light-induced mutations. Signature 4 is 
found in cancers associated with tobacco smoking (Fig. 3) and has the 
mutational features associated with tobacco carcinogens. The causal
relationship between tobacco smoking and signature 4 is supported by a 
strong positive association between smoking history and the contribu- 
tions of signature 4 to individual cancers (P = 1.1 X 10°”, Supplemen- 
tary Figs 44-46, 74-76 and 96). 

Cigarette smoke contains over 60 carcinogens and it is possible
that this complex mixture may initiate other mutational processes. 
Signatures 1A/B, 2 and 5 were also found in lung adenocarcinoma. 
Signature 5, but not signatures 1A/B and 2, also showed a positive 
correlation between smoking history and mutation contribution 
(P = 8.0 X 10 °, Supplementary Fig. 96). Thus, in lung cancer, sig- 
nature 5, which is characterized predominantly by C>T and T>C 
mutations, may also be due to tobacco carcinogens. However, it is also 
present in nine other cancer types, most of which are not strongly 
associated with tobacco consumption, and therefore its aetiology 
overall is unclear (Fig. 3). 

Some anticancer drugs are mutagens. Signature 11 is found in
malignant melanomas and glioblastoma multiforme pretreated with 
the alkylating agent temozolomide (P = 4.0 X 10 *) and has muta- 
tional features very similar to those previously reported in experimental 
studies of alkylating agents.

Abnormalities in DNA maintenance may also be responsible for 
mutational signatures, and the roles of defective DNA mismatch repair 
(signature 6) and defective homologous-recombination-based DNA 
double-strand break repair (signature 3) have been discussed above. 
Other signatures may result from abnormal activity of enzymes that 
modify DNA or of error-prone polymerases. Signatures 2 and 13 have 
been attributed to the AID/APOBEC family of cytidine deaminases.
On the basis of similarities in the sequence context of cytosine mutations
caused by APOBEC enzymes in experimental systems, a role for
APOBEC1, APOBEC3A and/or APOBEC3B in human cancer seems
more likely than for other members of the family. However, the
reason for the extreme activation of this mutational process in some
cancers is unknown. Because APOBEC activation constitutes part of
the innate immune response to viruses and retrotransposons, it may
be that these mutational signatures represent collateral damage on the 
human genome from a response originally directed at retrotransposing 
DNA elements or exogenous viruses. Confirmation of this hypothesis 
would establish an important new mechanism for initiation of human
carcinogenesis.

Signature 9, observed in chronic lymphocytic leukaemia and malignant
B-cell lymphomas, is characterized by T>G transversions at
ApTpN and TpTpN trinucleotides, and is restricted to cancers that have 
undergone somatic immunoglobulin gene hypermutation (IGHV- 
mutated) associated with AID (P = 2.5 X10 * in chronic lymphoid 
leukaemia (CLL)). Signature 9 does not, however, have the known mutational
features of AID, and has been proposed to be due to polymerase η,
an error-prone polymerase involved in processing AID-induced
cytidine deamination. Similarly, signature 10, which generates huge
numbers of mutations in subsets of colorectal and uterine cancer, has
been previously associated with altered activity of the error-prone
polymerase Pol ε consequent on mutations in the gene.

Many mutational signatures do not, however, have an established or 
proposed underlying mutational process or aetiology. Some, for example 
signatures 8, 12 and 16, show strong transcriptional strand bias (Fig. 5) 
and possibly reflect the involvement of transcription-coupled nucleotide 
excision repair acting on bulky DNA adducts due to exogenous carcino- 
gens. Others, for example signatures 14, 15 and 21, show overwhelming 
activity in a small number of cancer cases (Supplementary Figs 38, 45 
and 56, respectively) and are perhaps more likely to be due to currently 
uncharacterized defects in DNA maintenance. 


Localized hypermutation 


Foci of localized substitution hypermutation, termed kataegis after 
the Greek for thunderstorm, were recently described in breast cancer.
Kataegis is characterized by clusters of C>T and/or C>G mutations 
which are substantially enriched at TpCpN trinucleotides and on the 
same DNA strand. Foci of kataegis include from a few to several 
thousand mutations and are often found in the vicinity of geno- 
mic rearrangements. The genomic regions affected are different in 
different cancers. On the basis of the substitution types and sequence 
context of kataegis substitutions, an underlying role for APOBEC 
family enzymes was proposed for kataegis as well as for signatures 2 
and 13 (ref. 6). 

The 507 whole-cancer genome mutation catalogues were searched 
for clusters of mutations. Cancers of breast (67 of 119), pancreas (11 of 
15), lung (20 of 24), liver (15 of 88), medulloblastomas (2 of 100), CLL 
(15 of 28), B-cell lymphomas (21 of 24) and acute lymphoblastic 
leukaemia (1 of 1) showed occasional (<10), small (<20 mutations) 
foci of kataegis, whereas acute myeloid leukaemia (0 of 7) and pilo- 
cytic astrocytoma (0 of 101) did not. Subsets of breast (7), lung (6) and 
haematological cancers (3) showed numerous (>10) kataegic foci and
two breast and one pancreatic cancer showed major foci of kataegis 
(>50 mutations) (Fig. 6 and Supplementary Figs 97 and 98). 

Kataegic foci are often associated with genomic rearrangements 
(Supplementary Fig. 98). In yeast, introduction of a DNA double- 
strand break greatly increases the likelihood of kataegis in its vicinity, 
indicating a role for such breaks in initiating the process”. However, 
even in cancer cases with kataegis, most rearrangements do not exhibit 
nearby kataegis, indicating that a double-strand break is not sufficient. 

In neoplasms of B-lymphocyte origin, including CLL and many 
lymphomas, mutation clusters recurrently occurred at immunoglo- 
bulin loci. In these cancers the mutation characteristics were different 
(Supplementary Fig. 98), bearing the hallmarks of somatic hypermutation 
associated with AID, which is operative during the generation of 
immunological diversity”. 


Discussion 


The diversity and complexity of somatic mutational processes under- 
lying carcinogenesis in human beings is now being revealed through 
mutational patterns buried within cancer genomes. It is likely that more 
mutational signatures will be extracted, together with more precise 
definition of their features, as the number of whole-genome sequenced 
cancers increases and analytical methods are further refined. 

Figure 6 | Kataegis in three cancers. Each of these ‘rainfall’ plots represents an
individual cancer sample in which each dot represents a single somatic
mutation ordered on the horizontal axis according to its position in the human
genome. The vertical axis denotes the genomic distance of each mutation from
the previous mutation. Arrowheads indicate clusters of mutations in kataegis.

The mechanistic basis of some signatures is, at least partially, under-
stood but for many it remains speculative or unknown. Elucidating the
underlying mutational processes will depend upon two major streams
of investigation. First, compilation of mutational signatures from 
model systems exposed to known mutagens or perturbations of the 
DNA maintenance machinery and comparison with those found in 
human cancers. Second, correlation of the contributions of mutational 
signatures with other biological characteristics of each cancer through 
diverse approaches ranging from molecular profiling to epidemiology. 
Collectively, these studies will advance our understanding of cancer 
aetiology with potential implications for prevention and treatment. 


METHODS SUMMARY 


Mutational catalogues were stringently filtered and our previously developed
computational framework was used to extract mutational signatures from
them. The computational framework for deciphering mutational signatures
and all mutational catalogues are freely available for download from
http://www.mathworks.com/matlabcentral/fileexchange/38724, whereas the complete
set of somatic mutations is available from
ftp://ftp.sanger.ac.uk/pub/cancer/AlexandrovEtAl. All presented mutational
signatures were validated. Kataegis was detected using an algorithm based on
piecewise constant fitting.


Full Methods and any associated references are available in the online version of 
the paper. 

METHODS


Validating mutational signatures. Validating a mutational signature requires 
ensuring that a large set of somatic mutations attributed to this signature is genuine 
in at least one sample. Validation is complicated as multiple mutational processes 
are usually operative in most cancer samples, and thus every individual somatic 
mutation can be probabilistically assigned to several mutational signatures. To 
overcome this limitation, we examined our data set for samples that are predomi- 
nantly generated by one mutational signature (that is, more than 50% of the 
somatic mutations in the sample belong to an individual mutational signature) 
and/or for samples in which all operative mutational processes have mutually 
exclusive patterns of mutations (for example, a sample with mutations only from 
signature 1B, which is predominantly C>T substitutions, and signature 18, which 
is predominantly C>A substitutions). We identified the optimal available sample 
for every mutational signature and attempted to validate the subset of somatic 
mutations attributed to this signature using one of three methods (Supplementary 
Fig. 99): (1) validation through re-sequencing with an orthogonal sequencing 
technology; (2) validation through re-sequencing with the same sequencing tech- 
nology (including RNA-seq, bisulphite sequencing, etc.); (3) validation through 
visual examination of somatic mutations by an experienced curator using a geno- 
mic browser and BAM files for both the tumour and its matched normal. 

For some of the previously published samples, we used the already reported 
validation data. When possible, somatic mutations were validated by either re- 
sequencing with orthogonal technology or re-sequencing using the same sequencing 
technology. We resorted to visual validation only when there was no other possibility 
for validating a mutational signature. 22 out of the 27 originally identified mutational 
signatures were validated (Supplementary Table 1 and Supplementary Fig. 99). 
Three mutational signatures failed validation: signatures R1 to R3 (Supplementary 
Figs 24 to 26). We were unable to validate two mutational signatures: signatures U1 
and U2 (Supplementary Figs 27 and 28), due to lack of available biological samples 
and access to BAM files for the samples with sufficient number of somatic mutations 
generated by these two mutational signatures. 

Samples and curation of freely available cancer data. Informed consent was 
obtained from all subjects. Collection and use of patient samples were approved 
by the appropriate Internal Review Board of each institution. In addition to newly 
generated data, we curated freely available somatic mutations from three other 
sources: (1) the data portal of The Cancer Genome Atlas (TCGA); (2) the data 
portal of the International Cancer Genome Consortium (ICGC); (3) previously 
published data in peer-reviewed journals, see additional references.

Filtering, estimating mutation prevalence and generating mutational catalogues.
In all examined samples, normal DNA from the same individuals had been sequenced
to establish the somatic origin of variants. Extensive filtering was performed to
remove any residual germline mutations and technology-specific sequencing arte-
facts before analysing the data. Germline mutations were filtered out from the lists
of reported mutations using the complete list of germline mutations from dbSNP,
1000 genomes project, NHLBI GO Exome Sequencing Project, and 69 Complete
Genomics panel (http://www.completegenomics.com/public-data/69-Genomes/).
Technology-specific sequencing artefacts were filtered out by using panels of BAM 
files of (unmatched) normal tissues containing more than 120 normal genomes and 
500 normal exomes. Any somatic mutation present in at least three well-mapping 
reads in at least two normal BAM files was discarded. The remaining somatic 
mutations were used for generating a mutational catalogue for every sample. 
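
The panel-of-normals rule described above can be sketched in a few lines. This is a minimal illustration and not the pipeline used in the study: the function name, the input layout and the assumption that supporting reads have already been counted per unmatched normal BAM file are all hypothetical.

```python
def passes_normal_panel(read_counts_per_normal, min_reads=3, min_normals=2):
    """Return True if a candidate somatic mutation survives the panel filter.

    read_counts_per_normal is a hypothetical list holding, for each
    unmatched normal BAM file, the number of well-mapping reads that
    carry the variant.
    """
    # Count the normals in which the variant is seen in at least min_reads reads.
    supporting_normals = sum(1 for n in read_counts_per_normal if n >= min_reads)
    # The mutation is discarded once that happens in at least min_normals normals.
    return supporting_normals < min_normals
```

With the default thresholds, a variant seen in three or more reads in only one normal is kept, whereas one seen at that depth in two normals is discarded.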

Prevalence of somatic mutations was estimated on the basis of a haploid human 
genome after all filtering. Prevalence of somatic mutations in exomes was calcu- 
lated based on the identified mutations in protein-coding genes and assuming that 
an average exome has 30 Mb in protein-coding genes with sufficient coverage. 
Prevalence of somatic mutations in whole genomes was calculated based on all 
identified mutations and assuming that an average whole genome has 2.8 gigabases 
with sufficient coverage. 
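
The prevalence calculation amounts to dividing a mutation count by the assumed amount of adequately covered sequence. The sketch below uses the two denominators stated above (30 Mb per exome, 2.8 gigabases per whole genome); the function and constant names are illustrative assumptions.

```python
# Covered sequence assumed per sample, in megabases (values from the text).
COVERED_MB = {
    "exome": 30.0,     # protein-coding genes with sufficient coverage
    "genome": 2800.0,  # 2.8 gigabases with sufficient coverage
}

def mutations_per_mb(n_mutations, assay):
    """Somatic mutation prevalence per megabase for an 'exome' or 'genome'."""
    return n_mutations / COVERED_MB[assay]
```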

The immediate 5’ and 3’ sequence context was extracted using the ENSEMBL
Core programming interfaces for human genome build GRCh37. Curated somatic
mutations that originally mapped to an older version of the human genome were re-
mapped using UCSC’s freely available lift genome annotations tool (any somatic
mutations with ambiguous or missing mappings were discarded). Dinucleotide
substitutions were identified when two substitutions were present in consecutive
bases on the same chromosome (sequence context was ignored). The immediate 5’
and 3’ sequence content of all indels was examined and the ones present at mono/ 
polynucleotide repeats or microhomologies were included in the analysed muta- 
tional catalogues as their respective types. Strand bias catalogues were derived for 
each sample using only substitutions identified in the transcribed regions of well- 
annotated protein-coding genes. Genomic regions of bidirectional transcription 
were excluded from the strand bias analysis. 
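
The rule for calling dinucleotide substitutions in the paragraph above (two substitutions at consecutive bases on the same chromosome) can be illustrated as follows. The tuple layout and function name are assumptions made for this sketch; it is not the code used in the study.

```python
def find_dinucleotide_substitutions(substitutions):
    """Pair up substitutions that sit on adjacent bases of one chromosome.

    substitutions is an iterable of (chromosome, position, ref, alt) tuples.
    Returns a list of (first, second) tuples for each adjacent pair.
    """
    # Sort by chromosome, then by position, so neighbours end up side by side.
    ordered = sorted(substitutions, key=lambda s: (s[0], s[1]))
    pairs = []
    for first, second in zip(ordered, ordered[1:]):
        same_chromosome = first[0] == second[0]
        adjacent = second[1] - first[1] == 1
        if same_chromosome and adjacent:
            pairs.append((first, second))
    return pairs
```

Note that a run of three consecutive substitutions would be reported here as two overlapping pairs; how such runs were handled is not stated in the text.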

Deciphering signatures of mutational processes. Mutational signatures were 
deciphered independently for each of the 30 cancer types using our previously
developed computational framework. The algorithm deciphers the minimal set
of mutational signatures that optimally explains the proportion of each mutation 
type found in each catalogue and then estimates the contribution of each sig- 
nature to each catalogue. Mutational signatures were also extracted separately for 
genomes and exomes. Mutational signatures extracted from exomes were nor- 
malized using the observed trinucleotide frequency in the human exome to the 
one of the human genome. All mutational signatures were clustered using unsu- 
pervised agglomerative hierarchical clustering and a threshold was selected to 
identify the set of consensus mutational signatures. Mis-clustering was avoided by 
manual examination (and whenever necessary re-assignment) of all signatures in 
all clusters. 27 consensus mutational signatures were identified across the 30 
cancer types. The computational framework for deciphering mutational signa-
tures as well as the data used in this study are freely available and can be down-
loaded from http://www.mathworks.com/matlabcentral/fileexchange/38724,
whereas the complete set of somatic mutations is available from
ftp://ftp.sanger.ac.uk/pub/cancer/AlexandrovEtAl.

Factors that influence extraction of mutational signatures. Recently, using 
simulated and real data, we described in detail the factors that influence the 
extraction of mutational signatures’. These included the number of available 
samples, the mutation prevalence in samples, the number of mutations contri- 
buted by different mutational signatures, the similarity between the signatures of 
mutational processes operative in cancer samples, as well as the limitations of our 
computational approach. Here, we examined data sets with varying sizes from 30 
different cancer types and we have taken great care to report only validated 
mutational signatures. However, our approach identified two similar patterns
that most likely represent the same biological process: signatures 1A and
1B. The reason for this is that, for some cancer types, we have sufficient numbers of
samples and/or mutations (that is, statistical power) to decipher the cleaner
version (that is, signature 1A), whereas for other cancer types we do not have
sufficient data and our approach extracts a version of the signature which is more
contaminated by other signatures present in that cancer type (that is, signature
1B). Nevertheless, the two signatures are very similar; hence we call them 1A and
1B. Being almost mutually exclusive among cancer types (that is, finding either 
signature 1A or 1B in each cancer type but not usually both) is supportive of the 
notion that they represent the same underlying process as is the fact that signa- 
tures 1A and 1B both correlate with age and have the same overall pattern of 
contributions to individual cancer genomes. Indeed, in our view it is likely that if 
we had sufficient data, signature 1B would disappear and the algorithm would 
extract only signature 1A. 

Displaying mutational signatures. Mutational signatures are displayed using a 
96 substitution classification defined by the substitution class and the sequence 
context immediately 3’ and 5’ to the mutated base. Mutational signatures are 
displayed in the main text of the report and in Supplementary Information on the 
basis of the observed trinucleotide frequency of the human genome; that is, 
representing the relative proportions of mutations generated in each signature 
based on the actual trinucleotide frequencies of the reference human genome. 
However, in Supplementary Information we also provide a visualization of muta- 
tional signatures based on an equal frequency of each trinucleotide (Supplemen- 
tary Figs 2-28). The equal trinucleotide frequency representation results, in all 
mutational signatures, in a greater degree of prominence of C>T substitutions at 
NpCpG trinucleotides as major features compared to the plots based on the 
observed trinucleotides. This difference may in some cases reflect the biological 
reality, that is, a propensity of the particular mutational process to be more active 
at NpCpG trinucleotides. However, note that it may also in some cases be due to 
incomplete extraction by the algorithm of the signature in question from sig- 
nature 1A/B, which is characterized by prominent features at NpCpG trinucleo- 
tides. This is likely to happen because (1) signature 1A/B is ubiquitous and (2) 
because even a small probability of mutations at NpCpG trinucleotides will 
generate a prominent feature because of the severe depletion of NpCpG trinu- 
cleotides in the reference genome. In future, with larger numbers of sequences 
and large numbers of whole-genome sequences it is anticipated that the latter 
effect will be reduced. 
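
As an illustration of the 96 substitution classification, the sketch below enumerates the six substitution classes (referred to by the pyrimidine of the mutated base pair) in each of the 16 possible 5’ and 3’ contexts. The label format, for example A[C>T]G, is an assumed notation chosen for readability.

```python
from itertools import product

BASES = "ACGT"
# Six substitution classes, named by the pyrimidine of the mutated base pair.
SUBSTITUTION_CLASSES = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]

def substitution_types():
    """List all labels of the form 5'-base[ref>alt]3'-base, e.g. 'A[C>T]G'."""
    return [
        f"{five}[{sub}]{three}"
        for sub, five, three in product(SUBSTITUTION_CLASSES, BASES, BASES)
    ]
```

Six classes multiplied by four 5’ bases and four 3’ bases gives the 96 mutation types.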

Approaches for associating cancer aetiology and exposures of validated muta- 
tional signatures. Generalized linear models (GLMs) were used to fit signature 
exposures (that is, number of mutations assigned to a signature) and age of cancer 
diagnoses. For each cancer type, all mutational signatures operative in it were 
evaluated using GLMs and the P values were corrected for multiple hypothesis 
testing using the Benjamini-Hochberg false discovery rate procedure. The result- 
ing P values indicate that age strongly correlates with signature 1A/B across 15 
cancer types (Supplementary Table 2). Exposure to signature 4 also correlates 
with age of diagnosis in kidney papillary and thyroid cancers. However, in both 
cancer types, we were not able to detect/extract signature 1A/B due to a low 
number of mutations in their samples and it is likely that signature 1A/B is 
currently mixed within signature 4. Further studies involving whole-genome
sequences will be needed to validate this hypothesis. Notably, in melanoma, age 
of diagnosis also correlates with exposure to signature 7, which we have associated 
with exposure to ultraviolet light. 

Associations between all other aetiologies and signature exposures were per-
formed using two-sample Kolmogorov-Smirnov tests between two sets of sam-
ples. The first set contains the signature exposures of the samples with the ‘desired
feature’ (for example, samples that contain a hypermutation in the immuno- 
globulin gene) and the second set is the signature exposures of the samples without 
the ‘desired feature’ (for example, samples that do not contain a hypermutation in 
the immunoglobulin gene). Samples with unknown feature status (for example, 
not knowing the status of the immunoglobulin gene) were ignored. Kolmogorov- 
Smirnov tests were performed for all signatures and all examined ‘features’ in a 
cancer type. P values were corrected for multiple hypothesis testing using the 
Benjamini-Hochberg false discovery rate procedure and based on the performed 
tests in a particular cancer class. 
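
The Benjamini-Hochberg correction applied to each family of tests can be sketched as below. This is a generic, standard-library implementation of the procedure and is offered only as an illustration; the software actually used for the correction is not stated here.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In the setting described above, the function would be called once per cancer class on the P values of all tests performed in that class.
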

A piecewise-constant-fitting-based algorithm for the detection of kataegis.
Foci of localized hypermutation, termed kataegis, were sought in 507 whole- 
genome sequenced cancers. High-quality variant calls that had been previously 
subjected to filtering for mutational signature analysis were investigated using an 
algorithm developed to identify foci of kataegis. 

For each sample, all mutations were ordered by chromosomal position and the 
intermutation distance, defined as the number of base pairs from each mutation 
to the next one, was calculated. Intermutation distances were then segmented 
using the piecewise constant fitting (PCF) method to find regions of constant
intermutation distance. Parameters used for PCF were γ = 25 and k_min = 2 and
were trained on the set of kataegis foci that had been manually identified, curated
and validated using orthogonal sequencing platforms. Putative regions of katae-
gis were identified as those segments containing six or more consecutive muta-
tions with an average intermutation distance of less than or equal to 1,000 bp.
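
The calling criterion (six or more consecutive mutations with an average intermutation distance of at most 1,000 bp) can be illustrated with a simple sliding-window scan. This is deliberately not the PCF segmentation used in the study: it is a simplified stand-in that applies only the final threshold, and the function name and merging rule are assumptions.

```python
def putative_kataegis_windows(positions, min_mutations=6, max_mean_distance=1000):
    """Return (start, end) coordinates of regions meeting the criterion.

    positions holds mutation coordinates from ONE chromosome of one sample.
    Overlapping qualifying windows are merged into a single region.
    """
    pos = sorted(positions)
    gaps_per_window = min_mutations - 1  # six mutations span five gaps
    foci = []
    for i in range(len(pos) - gaps_per_window):
        j = i + gaps_per_window
        mean_distance = (pos[j] - pos[i]) / gaps_per_window
        if mean_distance <= max_mean_distance:
            if foci and pos[i] <= foci[-1][1]:
                # Window overlaps the previous region: extend it.
                foci[-1] = (foci[-1][0], pos[j])
            else:
                foci.append((pos[i], pos[j]))
    return foci
```

Six mutations with a mean spacing of 1,000 bp span at most 5,000 bp, which is why a single window test on the first and last position is sufficient here.
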

Variation in number of foci of kataegis and relationship with genome-wide
mutation burden. To examine the likelihood of kataegis occurring for different
mutation burdens, the expected number of kataegis events that would be observed
by chance was calculated for a range of total number of mutations per cancer, n,
between 1,000 and 2,000,000. The probability that any one mutation will be
followed by five other mutations within a distance of 5,000 bp, thereby triggering
the identification of kataegis, is given by p = P(Pois(5,000n/g) ≥ 5), where g is the
length of the genome, in base pairs.
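
The per-mutation probability can be evaluated directly as the upper tail of a Poisson distribution with rate 5,000n/g, reading the condition as at least five further mutations in the window. The sketch below is illustrative only: the function name is hypothetical, and any genome length passed in (for example a round 3 × 10^9 bp) is an assumption. It makes no attempt to reproduce the expected counts quoted from Supplementary Fig. 97, because how p is converted into a per-genome count is not spelled out here.

```python
import math

def prob_cluster_by_chance(n_mutations, genome_length, window_bp=5000, k=5):
    """P(Pois(window_bp * n / g) >= k) under a uniform background.

    Adequate for the small rates discussed here; exp(-rate) underflows
    for rates in the high hundreds.
    """
    lam = window_bp * n_mutations / genome_length
    if lam == 0:
        return 0.0
    # Sum the upper tail term by term to avoid cancellation when lam is small.
    term = math.exp(-lam) * lam**k / math.factorial(k)
    total, i = 0.0, k
    while term > 0.0 and (i < lam or term > 1e-18 * total):
        total += term
        i += 1
        term *= lam / i
    return min(total, 1.0)
```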

Supplementary Fig. 97 shows the expected number of kataegis events identified 
in genomes with between 100,000 and 500,000 mutations. For cancers with up to 
200,000 mutations, the expected number of kataegis events is extremely small 
(0.16 for a total mutation load of 200,000), making the detection of kataegic foci 
highly significant for each sample. Supplementary Table 3 presents all the samples 
in which kataegic foci were identified, the total mutation burden for each sample, 
the observed number of kataegic foci, and the expected number of foci. 

Specificity of variants in kataegis foci. Clusters of variant calls can easily occur in
regions of low sequence complexity. These are not true substitution mutations but 
represent systematic sequencing artefacts or mis-mapping of short reads. The 
quality of variant calls depends on the quality of mutation-calling by individual 
institutions. Additional filtering was applied to remove likely false-positive calls 
and then putative kataegic foci were individually curated. 

1,436 kataegis foci were called by PCF, with 873 finalized as putative kataegis 
foci (Supplementary Table 4) involving 9,219 substitution variants. Where pos- 
sible, BAM files were retrieved, inspected and substitution variants involved in 
kataegis foci were manually curated to remove likely false-positive calls. Where 
BAM files were not available to us, substitution variants were strictly excluded if 
called in: (1) genomic features that generate mapping errors, for example, regions 
of excessively high coverage due to collapsed repeat sequences in the reference 
genome”; (2) highly repetitive regions with reads consistently demonstrating low 
mapping qualities in 20 unrelated normal samples; (3) locations with known germ- 
line insertions/deletions within the sequencing reads reporting the mutated base. 

Several features were seen in the finalized putative kataegis foci, which rein- 
forced the conviction in the validity of these calls. Although clusters of mutations 
identified by the PCF method were sought in an approach unbiased by mutation 
type and based exclusively on intermutation distances, we find that the 873 
putative foci demonstrate: first, a preponderance to C>T and C>G mutations 
(Supplementary Fig. 97b); second, the enrichment for a TpC sequence context as 
previously described® (Supplementary Fig. 97b); third, processivity (where con- 
secutive mutations within a cluster were on the same strand; that is, 6 C>T 
mutations in a row or 6 G>A mutations in a row; Fig. 6c); and fourth, visual 
curation of reads carrying these processive variants showed that the variants were 
usually in cis (that is, mutations were on the same read (Supplementary Fig. 97c) 
or on the read mate of other affected alleles within the insert size) with respect to 
each other, indicating that they had arisen on the same allele. Finally, where data
were available, we found that clusters of substitution mutations within the same 
kataegis foci shared approximately the same variant allele fraction, indicating that 
they had probably arisen during a single cell cycle event. 

BAM files from some samples were not accessible and therefore a proportion of 
substitution variants involved in kataegis foci were not visually curated. The 
application of the strict criteria described above and the subsequent finding of 
the consistency of the mutation-type, sequence context, processive nature of the 
mutations, with the majority in cis on individual sequencing reads, indicates that 
the vast majority of these foci are probably genuine. However, the possibility that 
some of the foci are not truly kataegis, particularly for the cancers which have not 
been validated or visually curated, remains. 

Sensitivity of kataegis detection. It is acknowledged that the likelihood of detec- 
tion of kataegis foci rests on the sensitivity of mutation detection. It is possible for 
foci to be missed because the mutations were not detected by mutation callers of 
the various institutions, before our analysis. This is particularly relevant for sub- 
clonal mutations bearing a low variant allele fraction or for mutations that occur 
on a single copy of a multi-copy locus. This is because the likelihood of mutation 
detection is reduced when uncorrected for copy number and for aberrant cell 
fraction of the tumour sample. Furthermore, our stringent post-processing cri- 
teria, particularly of samples that have not been visually curated, make it more 
likely that kataegis is under-represented in this analysis. 

Relationship between kataegis and large-scale genomic changes. Reinforcing 
our previous findings®, we found that some kataegic foci were very closely assoc- 
iated with rearrangements. For example, a breast cancer sample with 1,534 point 
mutations had only one focus of kataegis which contained 32 point mutations. 
The same breast cancer sample also had 25 large-scale genomic structural varia- 
tions scattered throughout the genome. However, one tandem duplication coin- 
cided with this single locus of kataegis in this cancer. Notably, no other mutations 
or structural variations were seen for 2 Mb flanking this extraordinary event (Sup- 
plementary Fig. 97b). Another breast cancer (Fig. 6) that contained 22,454 muta- 
tions and had 292 rearrangements altogether, had nine regions of kataegis, 
five of which coincided with large-scale structural variations, underscoring the 
co-localization of kataegis foci with structural variations. This also highlights that 
not all foci of kataegis co-localized with structural variations and not all structural 
variations were associated with kataegis. 

Sites of amplification represent a potential source of false variant calls. If the 
amplification occurred early in the evolution of a cancer, then there is an increased 
likelihood of substitutions accumulating randomly within the amplified genomic 
region. When mapped back to the reference genome, these will appear as clustered 
variants. 

A number of features allow us to distinguish such events from ‘true’ kataegis. 
These mutations would not be expected to have features associated with kataegis, 
such as the mutation type, predilection for a TpC sequence context and the 
processivity. Furthermore, if they have accumulated as random events in a 
multi-copy locus, then they would be less likely to occur in cis (on the same 
sequencing read) with respect to each other. In contrast, mutations which have 
occurred at the same time, during one moment of transient hypermutability in a 
single cell cycle event, would be expected to cluster on one copy of a multi-copy 
locus, to be in cis and to demonstrate approximately the same variant allele 
fraction. Finally, to achieve the level of hypermutation required to be called as 
a focus of kataegis (average intermutation distance of less than 1,000 bp for six 
consecutive mutations equivalent to ~ 1,000 substitutions per Mb), the degree of 
copy number amplification would have to be considerable. 

To examine this likelihood of false calls in regions of amplification, simulations
were performed assuming background mutation rates of 10 per Mb, 40 per Mb
and 100 per Mb for different copy number states and for different sizes of focal 
amplification. The expected number of kataegic foci for these different states are 
provided in Supplementary Table 5. For most of the samples in which kataegis 
was detected (all but twenty), a 10 Mb region of amplification would require a 
copy number state of 36 or above to generate 1 cluster of 6 mutations with an 
average intermutation distance of less than 1,000 bp. For 19 of the remaining 20 
samples, a 10 Mb region of amplification would require a copy number state of 10 
or above. For the single cancer with a mutation rate exceeding 40 per Mb, a copy 
number state of 4 is required to generate a cluster of mutations. As mentioned 
previously, these clusters would have to be processive, be in cis and have roughly 
the same variant allele fraction to be called as a focus of kataegis. 

Definition of kataegis. Kataegis has been identified via a PCF-based method as 6 or
more consecutive mutations with an average intermutation distance of less than or 
equal to 1,000 bp. Other salient features include a preponderance for C>T and C>G 
mutations, a predilection for a TpC mutation context, processivity, evidence of having 
arisen on the same parental allele (being in cis) on sequencing reads and additionally 
(but not necessarily) co-localization with large-scale genomic structural variation. 

GLUTAMATE RECEPTOR-LIKE genes
mediate leaf-to-leaf wound signalling 


Seyed A. R. Mousavi, Adeline Chauvin, Francois Pascaud, Stephan Kellenberger & Edward E. Farmer


Wounded leaves communicate their damage status to one another through a poorly understood process of long-distance 
signalling. This stimulates the distal production of jasmonates, potent regulators of defence responses. Using non-invasive 
electrodes we mapped surface potential changes in Arabidopsis thaliana after wounding leaf eight and found that 
membrane depolarizations correlated with jasmonate signalling domains in undamaged leaves. Furthermore, current 
injection elicited jasmonoyl-isoleucine accumulation, resulting in a transcriptome enriched in RNAs encoding key 
jasmonate signalling regulators. From among 34 screened membrane protein mutant lines, mutations in several clade 3 
GLUTAMATE RECEPTOR-LIKE genes (GLRs 3.2, 3.3 and 3.6) attenuated wound-induced surface potential changes. 
Jasmonate-response gene expression in leaves distal to wounds was reduced in a glr3.3 glr3.6 double mutant. This
work provides a genetic basis for investigating mechanisms of long-distance wound signalling in plants and indicates
that plant genes related to those important for synaptic activity in animals function in organ-to-organ wound signalling.


Unlike plants, animals rely on rapid nervous systems to escape pre- 
dation. A stationary fly that perceives danger takes less than 300 ms to 
take off, and this process requires complex whole-body coordination’. 
Nevertheless, this escape response is too slow if the fly lands on a 
Venus flytrap, a plant in which electrical signals initiate rapid trap 
closure’. Whereas fast movements associated with insect capture are 
exceptional, slower herbivore-induced defence gene expression is 
widespread in plants and is coordinated between organs° . What, then, 
is the nature of the long distance signal(s) that leads to defence res- 
ponses throughout much of a plant body after wounding? Among the 
many scenarios proposed to explain the nature of systemic wound 
signals in plants* is a role for electrical signalling’. However, this has 
not been substantiated and it is essential to identify genes that underlie 
this phenomenon. 

Resistance to herbivores depends to a large extent on the production 
of potent regulatory lipids known as jasmonates®. Without the ability 
to produce or perceive these compounds, plants that normally resist 
attack become remarkably vulnerable to predation’. Both jasmonic 
acid (JA) and biologically active jasmonoyl-isoleucine (JA-Ile)* accu- 
mulate within minutes in wounds and in undamaged distal tissues’. 
Similarly, when feeding on A. thaliana, the Egyptian cotton leafworm 
(Spodoptera littoralis) stimulates jasmonate-regulated transcription in 
tissues several centimetres from a wound” and when feeding on bean, 
these insects provoke plasma membrane depolarizations that spread 
through entire wounded leaves’*. Such plasma membrane depolariza- 
tions are common in plants’*"* and are also produced after exposure 
of cells to damage-associated molecular patterns’®, including peptide 
danger signals'’. Moreover, treatment of tomato cells with ionophores 
that cause plasma membrane depolarization stimulated the expression 
of jasmonate-regulated genes. Related to this, membrane depolar-
izations in potato preceded increases in cytosolic Ca²⁺, and jasmonate
accumulation was reduced when these Ca²⁺ transients were blocked.
Here, concentrating exclusively on the jasmonate defence pathway 
and using non-invasive surface electrodes*', we monitored changes 
in electrical activity due to ion fluxes in cell populations in wounded 
Arabidopsis leaves. We show that electrical signals activate jasmonate 


biosynthesis in leaves distal to wounds and identify genes involved in 
the propagation of these signals. 


Wound-activated surface potential changes 


To investigate patterns of electrical activity and gene expression in 
5-week-old rosettes, individual leaves were numbered from oldest to 
youngest. Electrodes placed on leaf 8 at the midrib (e1 electrode
position), midrib/petiole junction (e2) and on the petiole (e3) did
not detect changes in electrical activity and such changes were not
elicited by walking S. littoralis larvae (Fig. 1a-c and Extended Data
Fig. 1a, b). When the larvae began to feed, wound-activated surface 
potential changes (WASPs) of variable amplitude, duration and com- 
plexity were observed (Supplementary Video 1 and Extended Data 
Fig. 1c). Because insects release chemical elicitors in addition to caus- 
ing wounding’, we investigated the effects of mechanical wounding on 
electrical activity. Simply touching the leaf did not generate changes in 
surface potential, but wounding the leaf tip resulted in strong and 
reproducible surface potential changes (Fig. 1b, c). When recordings 
were extended, they often showed periodicity (Extended Data Fig. 1d). 
We used three parameters to characterize these signals: latency (time 
from wounding to arrival at the amplitude midpoint), amplitude and 
duration (Fig. 1b). To gain more information on the spread of WASPs 
within a wounded leaf, four electrodes were placed on the leaf surface 
(Fig. 1a). After damage, WASPs were detected first at e1, then several
seconds later at e2, and finally at e3. An electrode on the lamina also 
detected damage-elicited electrical activity and, in each case (Fig. 1c), 
the signals we measured had the same polarity as those produced after 
chilling, a treatment known to cause plasma membrane depolariza- 
tion”*. Therefore WASPs in leaf 8 were due to plasma membrane 
depolarization (Extended Data Fig. 1e–g).
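
The three parameters used above (latency, amplitude and duration) can be illustrated with a minimal Python sketch. It is not the authors' quantification procedure: the function name, the use of the pre-wounding mean as baseline, and the choice to read midpoint crossings at sample resolution without interpolation are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional, Sequence


class WaspParameters(NamedTuple):
    latency_s: float              # wounding -> first crossing of the amplitude midpoint
    amplitude_mv: float           # peak deviation from the pre-wounding baseline
    duration_s: Optional[float]   # time between midpoint crossings; None if no recovery


def wasp_parameters(times_s: Sequence[float],
                    potentials_mv: Sequence[float],
                    wound_time_s: float) -> WaspParameters:
    """Estimate latency, amplitude and duration from one surface potential trace."""
    pre = [v for t, v in zip(times_s, potentials_mv) if t < wound_time_s]
    post = [(t, v) for t, v in zip(times_s, potentials_mv) if t >= wound_time_s]
    if not pre or not post:
        raise ValueError("need samples both before and after wounding")

    # Assumption: the baseline is the mean of all samples recorded before wounding.
    baseline = sum(pre) / len(pre)

    # Peak = sample with the largest absolute deviation from baseline.
    peak_time, peak_value = max(post, key=lambda tv: abs(tv[1] - baseline))
    amplitude = peak_value - baseline
    if amplitude == 0:
        raise ValueError("no surface potential change detected")

    def past_midpoint(value: float) -> bool:
        # True once the deviation reaches half of the peak amplitude.
        return (value - baseline) / amplitude >= 0.5

    onset = next(t for t, v in post if past_midpoint(v))
    recovery = next((t for t, v in post if t > peak_time and not past_midpoint(v)), None)
    duration = None if recovery is None else recovery - onset
    return WaspParameters(onset - wound_time_s, amplitude, duration)
```

A trace that never returns past the midpoint yields `duration_s = None`, which mirrors the observation that the signal in the wounded leaf typically did not recover to baseline during recording.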


WASP territories and speeds 

Signals generated by wounding leaf tips first move towards the centre of 
the rosette and then disperse away from the apex into a restricted num- 
ber of distal leaves to initiate distal JA accumulation and signalling’®. To 
map the spatial distribution of WASPs in the rosette after wounding 
Figure 1 | WASPs and JAZ10 expression map to identical spatial domains.
a, Experimental design for detecting surface potential changes on leaves.
Measuring electrodes e1, midrib; e2, petiole/midrib junction; e3, petiole. The
lamina electrode (eL) was 3 mm from e1. The apical part of the leaf was
wounded with forceps. b, Three distinct variables, latency (Lat.), duration and 
amplitude of WASPs, were analysed. The signal in the wounded leaf (leaf 8, e3) 
typically did not recover to baseline during recording (unfilled arrowhead). 
Time of wounding is indicated with a filled arrowhead. c, Typical surface 
potential changes recorded on leaf 8. Arrowheads indicate when the leaf was 
touched or wounded. d, Representative WASPs generated on distal leaves after
wounding (W) leaf 8 (n = 10–61). Two types of surface potential change were
observed on leaf 6. The solid and dashed lines show traces for 63% and 37%
(n = 19) of events, respectively. e, Levels of JAZ10 (± s.d.) transcript in
unwounded leaves (upper panel) and 1h after mechanical wounding of leaf 8 
(lower panel). *P < 0.05, **P < 0.01, ***P < 0.001. Numbers within bars show 
the product of amplitude and duration (mV s) for leaves that showed 
depolarizations. f, Heat maps for JAZ10 transcript induction 1 h after wounding 
leaf 8 and duration of surface potential changes produced after wounding leaf 8. 
Only leaves that were investigated are indicated. Data for JAZ10 levels are from 
e. WASP durations from Extended Data Fig. 2b. V, variable leaf. 


leaf 8 we placed electrodes on this leaf or in the e3 position on leaves 5 
through 18. The changes in amplitude observed in wounded leaf 8 were 
typically close to −70 mV (Extended Data Fig. 2a) and the unwounded
leaves 5, 11, 13 and 16 showed similar responses (Fig. 1d, Extended
Data Fig. 2b). For example, after wounding leaf 8, a WASP with a
duration of 78 ± 20 s and a peak amplitude of −51 ± 9 mV was recorded
in leaf 13 after a latency of 66 ± 13 s (n = 61 plants). However, other
leaves (7, 9, 10, 12, 14, 15, 17 and 18) showed small positive surface
potential changes. For example, leaf 9 showed a 20 ± 5 mV change in
surface potential with a latency of 54 ± 12 s (n = 46 plants). Most of
these observations fit a developmental pattern: in adult-phase Arabidopsis 
rosettes, leaf n shares direct vascular connections to leaves n + 5 and 
n + 8. Thus the wounded leaf 8 is connected to leaves 13 and 16, these 
connections being termed parastichies™. Additionally, leaves 5 and 11 
also showed strong negative surface potential changes after wounding 
leaf 8. These leaves are n ± 3 relative to the wounded leaf 8 and may
represent contact parastichies formed by proximal but unconnected 
vasculature™*. We also recorded changes in surface potentials in the 
n − 2 leaf (leaf 6) that were similar to those in wounded leaf 8 in 63% of
recordings, but the remaining recordings from this leaf (Fig. 1d)
resembled traces from leaves such as leaf 9. We termed leaf 6 a variable 
leaf. 
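
The leaf-numbering pattern described above can be written out as a short Python sketch. It is only an illustration: the function name and arguments are hypothetical, and applying each offset (5 and 8 for direct vascular connections, 3 for contact parastichies) in both directions from the wounded leaf is an assumption. The variable leaf 6 (n − 2) is deliberately not predicted.

```python
from typing import List, Tuple


def predicted_responders(wounded_leaf: int,
                         first_leaf: int,
                         last_leaf: int,
                         direct_offsets: Tuple[int, ...] = (5, 8),
                         contact_offsets: Tuple[int, ...] = (3,)) -> List[int]:
    """List leaves expected to show strong negative surface potential changes.

    Offsets are applied in both directions (n - k and n + k); this symmetry is an
    assumption of the sketch. Only leaves inside the monitored range are returned.
    """
    leaves = set()
    for offset in direct_offsets + contact_offsets:
        for candidate in (wounded_leaf - offset, wounded_leaf + offset):
            if first_leaf <= candidate <= last_leaf:
                leaves.add(candidate)
    return sorted(leaves)


# Leaves 5 through 18 were monitored after wounding leaf 8.
print(predicted_responders(8, 5, 18))
```

For wounded leaf 8 and monitored leaves 5 to 18 this returns leaves 5, 11, 13 and 16, the same unwounded leaves reported to respond like the wounded leaf.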

Quantitative electrophysiological data (Extended Data Fig. 2b) 
were then compared with transcript levels for JASMONATE-ZIM
DOMAIN 10 (JAZ10), a robust marker for activity of the jasmonate 
pathway”. One hour after wounding leaf 8 we detected =100-fold 
increases in JAZ10 transcript levels in leaves 5, 8, 11, 13 and 16 
(Fig. 1e). JAZ10 transcript induction in leaf 6, like WASP production,
was variable. Heat maps from quantitative data showed that JAZ10
expression at 1 h post-wounding and WASP durations covered identical
territories, spanning 137° of the rosette when variable leaf 6
(n − 2) was included (Fig. 1f).

We next examined the speed at which electrical signals moved 
within wounded leaves and from leaf to leaf. Replicated measurements
indicated a range of speeds from 2.6 ± 0.6 cm min⁻¹ between
the wound and an electrode placed on the lamina, to up to 9 cm min⁻¹
between electrodes placed on the midribs of the wounded leaf itself or
at intervals along the midrib on leaf 13 (Extended Data Fig. 2c). The
similar apparent velocities for surface potential changes in the midribs
of wounded and distal leaves indicate that related mechanisms control
electrical signalling in these leaves. However, signals from the wounded
leaf seemed to slow to 5.4 ± 1.5 cm min⁻¹ at the centre of the plant
before accelerating again in the distal leaf, bringing the average signal
speed from wounded leaf 8 to receiver leaf 13 to 5.8 ± 1.1 cm min⁻¹
(n = 13). This overall velocity estimate is concordant with recent esti- 
mates of signal speeds based on JA accumulation in leaf 13 of 
Arabidopsis after wounding leaf 8 (ref. 26), and with self-propagating 
electrical activity elicited by wounding bean or barley leaves’’. To test 
whether WASPs could travel from young to older leaves we wounded 
leaf 13 and monitored events in leaf 8. Again we observed a correlated 
pattern of WASP production and JAZ10 expression (Extended Data 
Fig. 3). Then, to investigate whether the long-distance signals that 
activate jasmonate responses travel at similar speeds to WASP changes, 
electrodes were placed in positions e2 and e3 on leaf 8 and this leaf was 
wounded. When the wounded leaf 8 was severed between e2 and e3 
after a signal had reached e3 we detected induced JAZ10 expression in 
leaf 13. However, JAZ10 induction was not observed if we cut the 
wounded leaf as the WASP arrived at e2 but before it reached e3 
(Extended Data Fig. 4). We conclude that the long-distance signals 
that strongly activate JAZ10 expression travel at a speed similar to 
electrical events elicited by wounding. 
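
As a worked illustration of how an apparent velocity follows from two electrode recordings, the sketch below divides the inter-electrode distance by the difference in latencies. The function name and the example numbers are hypothetical and are not taken from the recordings reported here.

```python
def apparent_velocity_cm_per_min(distance_cm: float,
                                 latency_near_s: float,
                                 latency_far_s: float) -> float:
    """Apparent signal velocity between two electrodes.

    distance_cm    -- distance between the two electrodes
    latency_near_s -- latency at the electrode closer to the wound
    latency_far_s  -- latency at the electrode further from the wound
    """
    delay_s = latency_far_s - latency_near_s
    if delay_s <= 0:
        raise ValueError("the far electrode must detect the signal later")
    # Convert from cm per second to cm per minute.
    return distance_cm * 60.0 / delay_s


# Hypothetical example: electrodes 1 cm apart, signal arrives 10 s later.
print(apparent_velocity_cm_per_min(1.0, 20.0, 30.0))
```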


Current injection and the Arabidopsis transcriptome 


To test for a direct link between the jasmonate pathway and electrical 
activity we implanted platinum (Pt) wires into the petiole of leaf 8 
(Fig. 2a), injected current, and monitored the induction of surface 
potentials in the lamina of this leaf. By structuring the input current 
appropriately (40 µA for 10 s; see Methods) we were able to induce
surface potential changes distal to the site of injection (Fig. 2b) with- 
out causing detectable cell damage other than that due to Pt wire 
implantation in the petiole, a region that was removed before analysing
the lamina (Extended Data Fig. 5a–d). The signals generated in
response to current injection (CI) had a mean duration of 59 ± 25 s
(Extended Data Fig. 5e), similar to the durations of WASPs in leaf 13
when leaf 8 was wounded. From these data we estimated the apparent
velocity of the surface potential change resulting from CI to be
6.4 ± 1.9 cm min⁻¹. This was close to an average action potential velocity
of 7 cm min⁻¹ that has been observed after CI in Arabidopsis”.
Because CI was shown to stimulate JA accumulation in tomato”, we 
measured the levels of both JA and JA-Ile after CI in leaf 8. Both 
compounds accumulated in response to this treatment (Fig. 2c, d). 
To confirm that CI could induce jasmonate signalling the expres- 
sion of two jasmonate-responsive genes, the regulatory gene JAZ10 
and VEGETATIVE STORAGE PROTEIN 2 (VSP2), an anti-insect
Figure 2 | Current injection (CI) induces jasmonate accumulation, gene
expression and surface potential changes in leaf 8. a, Placement of Pt wires,
proximal electrode (eP), distal electrode (eD) and laminar electrode (eL) for CI
experiments. The leaf blade to the left of the dashed line was used for transcript
measurements and quantification of jasmonate levels. b, Surface potential
generation following CI (40 µA, 10 s). Art = artefacts of CI. Bar, duration of CI.
c, Levels of JA (± s.d.) 20 min and 1 h after CI. d, Levels of JA-Ile (± s.d.) 20 min


defence gene’, was monitored. Transcripts for both genes were upre- 
gulated in response to CI (Fig. 2e, f). Furthermore, the WASPs detected 
on wild-type plants were indistinguishable from those on wounded 
mutant plants that lacked the ability to synthesize jasmonates 
(Extended Data Fig. 6a-c). Therefore, the mechanism that produces 
WASPs is upstream or independent of jasmonate synthesis. When 
current was injected into the coronatine insensitive 1-1 (coi1-1)"', a
mutant lacking the functional jasmonate receptor, we recorded poten- 
tials similar to those in the current-injected wild type (Extended Data 
Fig. 6d). However, we were unable to induce JAZ10 expression in these 
plants (Extended Data Fig. 6e). The canonical jasmonate signal pathway® 
is therefore required for expression of JAZ10 after wounding and CI. 
Consistent with the detection of WASPs over entire leaf surfaces, plants 
expressing a wound-inducible VSP2 reporter (Fig. 2g) also responded to 
CI throughout the lamina (Fig. 2h). 

To find out if other jasmonate-regulated genes were activated by CI 
we performed whole-transcriptome analysis starting with current- 
injected leaf 8. This revealed that 313 genes were >twofold upregu- 
lated (Fig. 3 and Supplementary Table 1). We then generated a second 
data set from leaf 13 one hour after wounding leaf 8. Finally, these results 
were compared to data produced independently from wounded 18-day-old
plants**. This comparison showed that 94% of the CI-upregulated
genes were also upregulated in leaf 13 of plants wounded on leaf 8, and 70%
of CI-induced transcripts were upregulated in both leaf wounding data
sets. Strikingly, among these were 9 of the 12 Arabidopsis JAZ genes 
(Extended Data Fig. 7a). These genes are critical regulators of jasmo- 
nate signalling***>. 
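
The percentages quoted above are set overlaps between lists of upregulated genes. A minimal sketch of that comparison follows; the function name is hypothetical and the gene identifiers are placeholders, not genes from Supplementary Table 1.

```python
from typing import Iterable


def percent_shared(query: Iterable[str], reference: Iterable[str]) -> float:
    """Percentage of genes in `query` that are also present in `reference`."""
    query_set = set(query)
    if not query_set:
        raise ValueError("query gene list is empty")
    return 100.0 * len(query_set & set(reference)) / len(query_set)


# Placeholder identifiers for illustration only.
ci_up = {"gene_a", "gene_b", "gene_c", "gene_d"}
leaf13_up = {"gene_a", "gene_b", "gene_c", "gene_x"}
rosette_up = {"gene_a", "gene_b", "gene_y"}

print(percent_shared(ci_up, leaf13_up))               # shared with one wounding data set
print(percent_shared(ci_up, leaf13_up & rosette_up))  # shared with both wounding data sets
```

Intersecting the two wounding data sets before the comparison gives the fraction of CI-induced transcripts that were upregulated in both.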

It is known that more transcripts are upregulated than are down- 
regulated in response to wounding'****’. Consistent with this, the levels 
of only 66 transcripts decreased in response to CI. Of these transcripts, 
47% were also downregulated in one or both wounding experiments 
(Extended Data Fig. 7b). Thirteen genes were downregulated in all three 
treatments (Extended Data Fig. 7c). Clearly, CI does not affect the 
expression of all wound-regulated genes and this is consistent with the 
fact that several signal mechanisms operate to control gene expression in 
damaged plants*******. One such pathway depends on NADPH oxidase
D (RBOHD) to transmit reactive oxygen species (ROS)-dependent long- 
distance signals at speeds similar to the WASP velocities we describe™. 
Although inhibitors of this ROS propagation pathway™ did not abolish 
and 1 h after CI. LOQ, limits of quantification (dashed line). e, Levels of JAZ10
transcripts (± s.d.) in leaves 1 h after wounding or CI (leaf 8 only). f, Levels of
VSP2 transcripts (± s.d.) in leaves 4 h after wounding or CI (leaf 8 only).
U, unwounded plants. W, wounded. ***P < 0.001. g, Expression pattern of a
VSP2-GUSPlus reporter line in leaf 8 from an unwounded plant, and 4 h after
wounding. h, VSP2-GUSPlus activity in leaf 8 of control plants (no CI) and 4 h
after CI.


WASPs (Extended Data Fig. 8a–c), one of them, diphenyleneiodonium
(DPI), reduced WASP duration in leaf 13 by a factor of 2. However, DPI 
did not inhibit distal JAZ10 expression (Extended Data Fig. 8d) and 
wound-induced JAZ10 expression was not reduced relative to wild type 
in rbohD mutants (Extended Data Fig. 8e) in which WASP production 
was similar to that in the wild type (Extended Data Fig. 9). In conclusion, 
the propagating signal leading to distal JAZ10 expression is likely to be 
RBOHD-independent.


GLR genes mediate long-distance wound signalling 

There is growing evidence that genes encoding various ion channels 
and pumps can affect jasmonate signalling**°°. For example, the over- 
expression of a GLUTAMATE RECEPTOR-LIKE (GLR) gene from 
radish stimulated the expression of jasmonate-regulated genes, including 


[Figure 3 Venn diagram labels: leaf 8 after current injection; leaf 13 after wounding leaf 8; rosette after wounding. Threshold: > twofold, P < 0.05.]


Figure 3 | Current injection and wounding stimulate the expression of a 
common JAZ gene-enriched subset of genes. Venn diagram for the number 
of transcripts upregulated more than twofold compared to unstimulated plants 
(P = 0.05) for current-injected leaf 8 (40 µA, 10 s; this study), for leaf 13 of
plants that had been wounded on leaf 8 (this study), and for wounded 
Arabidopsis rosettes**. Numbers in parentheses are upregulated JAZ genes. 
Figure 4 | glr mutants reduce the duration of WASPs, and responses to
current injection. a, Typical WASPs in wild type (WT), glr3.3a, glr3.6a and
glr3.3a glr3.6a after wounding leaf 8. Dashed line, WASP duration. b, JAZ10
expression (± s.d.) after wounding leaf 8 in the WT, glr single mutants, and the


VSP1, in Arabidopsis”. Using a reverse genetic approach based on 
monitoring electrical activity after wounding leaf 8, we screened 34 
homozygous Arabidopsis mutants in putative pumps and ion channels 
including GLRs (Extended Data Fig. 9). Mutations in 4 genes, GLR3.1, 
GLR3.2 (two alleles), GLR3.3 (two alleles) and GLR3.6 (two alleles), all 
of which reduced GLR expression (Extended Data Fig. 10), caused 
reduced durations of surface potential changes either in leaf 8, leaf 
13, or both (Fig. 4a and Extended Data Fig. 9). Two of the glr mutations,
one that reduced WASP duration in leaf 8 (glr3.3a), and one that
reduced WASP duration in leaf 13 (glr3.6a), were combined to produce
glr3.3a glr3.6a. This genotype, and a second related double mutant,
showed reduced electrical activity in wounded leaf 8 but, unlike the
single mutants, changes in surface potential were no longer detectable
in leaf 13 when leaf 8 was wounded (Fig. 4a and Extended Data Fig. 9).
Additionally, although wounding caused elevated JAZ10 transcript
levels in wounded leaf 8 of the wild type, in the glr single mutants,
and in the glr3.3 glr3.6 double mutant, JAZ10 expression was reduced
in distal leaf 13 of both glr single mutants (Fig. 4b). A stronger reduction
in wound-induced JAZ10 transcript level relative to the wild type
was seen in distal leaf 13 of the glr3.3a glr3.6a double mutant (Fig. 4b).
Finally, we were unable to stimulate electrical activity after current 
injection in the double mutant (Fig. 4c) and the elevation of JAZ10 
transcript levels seen in leaf 8 of the wild type after CI was almost 
abolished (Fig. 4d). 


Discussion 


We have identified genes involved in the propagation of electrical 
activity leading to defence gene expression. The findings are consistent 
with a previous report implicating electrical signalling in the distal 
activation of proteinase inhibitor gene expression in tomato seedlings”. 
The GLR genes we studied encode putative cation channels, and GLR3.3 
functions in agonist-stimulated plasma membrane depolarization**”. 
This gene’*, and several GLRs expressed in pollen”, can control cytosolic
Ca²⁺ influxes, and GLRs have also been implicated in mediating
calcium influxes in response to the perception of microbe-associated 
molecular patterns*’. Our results now show that GLRs control the distal 
wound-stimulated expression of several key jasmonate-inducible regu- 
lators of jasmonate signalling (JAZ genes) in the adult-phase plant. 
Finally, GLRs are related to ionotropic glutamate receptors (iGluRs) 
that are important for fast excitatory synaptic transmission in the ver- 
tebrate nervous system. They and their plant relatives may control 
signalling mechanisms that existed before the divergence of animals 


glr3.3a glr3.6a double mutant. c, Lack of surface potential generation in leaf 8 of
the glr3.3a glr3.6a double mutant after CI. d, JAZ10 expression (± s.d.) after CI.
For b and d, RNA samples were collected 1 h after stimulation. 

**P < 0.01,***P < 0.001. 


and plants”. If so, a deeply conserved function for these genes might be 
to link damage perception to distal protective responses. 


METHODS SUMMARY 


Arabidopsis thaliana accession Col-0 and T-DNA insertion lines were obtained 
from the Nottingham Arabidopsis Stock Centre (NASC). Their homozygosity 
was confirmed before all experiments. Surface potentials were recorded with 
silver/silver chloride electrodes placed in 10 µl of 10 mM KCl in 0.5% (w/v) agar.
The ground electrode was placed in the soil. Current injection was carried out 
via two platinum wires that were inserted into the leaf one day before the experi- 
ment. Quantitative PCR for JAZ10 (At5g13220) and VSP2 (At5g24770) was from 
reverse-transcribed RNA by SYBR Green assays and standardized to ubiquitin- 
conjugating enzyme (UBC21, At5g25760). For transcriptome analysis, amplified 
RNA was hybridized to Affymetrix ATH1 arrays. Probe sets showing at least a 
twofold change and a P value of < 0.05 were considered significant. Jasmonates, 
extracted according to ref. 10, were separated by high-performance liquid chromatography
(Phenomenex Kinetex; 2.6 µm C18 100 Å column) and quantified
by electrospray ionization mass spectrometry in the multiple reaction monitoring 
mode. 


Online Content Any additional Methods, Extended Data display items and
Source Data are available in the online version of the paper; references unique
to these sections appear only in the online paper.
METHODS


Plant material and growth conditions and bioassays. Arabidopsis thaliana 
(Columbia) were soil-grown (one seed per 7-cm diameter pot) for 5 to 6 weeks 
with 10 h light (100 µE s⁻¹ m⁻²), 70% humidity; day 22 °C, night 18 °C. One
single wounding experiment was carried out per plant. Wounds were inflicted 
with plastic non-locking thumb forceps that had flat, 4-mm-wide ridged tips. The
space between each ridge was 1 mm with an inter-ridge depth of 0.6 mm. Wounds 
were inflicted with these ridges parallel to the long axis of the leaf. The first wound 
was made at the leaf tip and the second wound was made so that it abutted the 
first, and so on until 40-50% of the leaf was wounded. The wounding procedure 
takes less than 10 s to complete. Prior to electrophysiology experiments plants were 
moved into a Faraday cage under the same light conditions. Spodoptera littoralis 
(4th instar larvae) were placed on plants or, alternatively, the apical parts of the 
leaves were crushed with plastic forceps. A transparent plastic support was used to 
stabilize the wounded leaf during the experiments. Leaves (excluding cotyledons) 
were numbered from old to young. 

Surface potential recordings and current injection. For surface potential 
recordings, silver electrodes 0.5 mm in diameter (World Precision Instruments) 
were chloridized with HCl (0.1 M), stored at room temperature and rechloridized
after several uses. Experiments were conducted in a controlled environment room 
without changing the growth conditions. Two 2-channel amplifiers (FD 223 and 
Duo 773, World Precision Instruments) were simultaneously used to record the 
surface potential at four positions. The electrode-leaf interface was a drop (10 µl)
of 10 mM KCl in 0.5% (w/v) agar placed so that the silver electrode did not contact 
and damage the cuticle. The inter-electrode distance was the distance between the 
nearest edges of these agar droplets. The ground electrode was placed in the soil. 
The procedure for data quantitation is shown in Fig. 1b in the main text. Latency 
is the period between wounding and WASP detection. Amplitude was relative to 
the baseline before wounding. Duration is the time between amplitude change 
midpoints. For experiments on interrupting signals, ceramic scissors (CS-250 
Kyocera) were used. For current injection two platinum wire electrodes (Advent 
Research Materials), 0.1-mm diameter were inserted in the midrib 1 cm apart 
(Fig. 2a) so that the end of the wire was visible from the abaxial leaf side but did 
not make contact with the soil. After insertion of the Pt wires the plants were rested 
for 24h before experiments. The two Pt wires were connected to a homemade 
current source that was controlled by the acquisition program. Current was 
injected between the two Pt wires and the wire closest to the leaf lamina served 
as the positive electrode. To optimise current injection to generate surface potential 
changes we used the experimental setup shown in Fig. 2a in the main text. Current 
was injected into the petiole and surface potentials were measured with an electrode
placed on the midrib in position eD. Combinations of 10, 20 or 40 µA for 1 or
10 s were tested. Only injecting 40 µA for 10 s led to reproducible surface potential
changes in the lamina. Control leaves (no CI) carried Pt wire implants but were not
subjected to CI. Control measurements indicated that during the injection of 40 µA
the voltage difference between the two platinum electrodes was 12.7 ± 0.9 V
(n = 9). Surface potentials were recorded as described above. In the combined 
experiments, the Chartmaster program via the InstruTECH LIH 8+8 interface 
(HEKA Electronic) was used to record the induced surface potential changes and 
to control the time and duration of current injection. In the experiments without 
current injection, surface potentials were recorded with Datatrax2 software via the 
LabTrax-4/16 interface (World Precision Instruments). The sampling interval
was 10 ms. Control plants were implanted with Pt wires in all current injection 
experiments. Trypan blue staining’ was used to assess the extent of damage caused 
by implanting Pt wires in petioles. The wires were implanted 24h before staining. 
For β-glucuronidase (GUS) reporter plants the current-injected leaf and the cognate
control leaf were harvested for GUS staining for 15 h at 37 °C. The tissue was
destained in 70% ethanol”. 
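
The current injection settings reported above (40 µA for 10 s, with about 12.7 V between the platinum wires) imply a few simple electrical quantities. The following sketch works them out under the assumption that current and voltage stay constant for the whole pulse; the function name is hypothetical, and the resistance is only an apparent value for the tissue between the wires.

```python
from typing import Dict


def injection_summary(current_a: float, duration_s: float, voltage_v: float) -> Dict[str, float]:
    """Charge, apparent resistance and energy for a constant-current pulse."""
    return {
        "charge_c": current_a * duration_s,               # Q = I * t
        "resistance_ohm": voltage_v / current_a,          # R = V / I (apparent)
        "energy_j": voltage_v * current_a * duration_s,   # E = V * I * t
    }


# 40 microamperes for 10 s at 12.7 V between the two Pt wires.
summary = injection_summary(40e-6, 10.0, 12.7)
print(summary)
```

Under these assumptions the pulse delivers about 0.4 mC of charge and about 5 mJ of energy, across an apparent resistance of roughly 320 kΩ.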

Ion leakage. To investigate the effects of current injection on ion leakage, Pt 
current injection wires were inserted into petioles and the plants were left overnight
before current injection (40 µA, 10 s). Three surface electrodes (eP, eD and
eL, placed in 0.5% w/v agar droplets (10 µl) containing 10 mM KCl) were placed
on the lamina (see Fig. 2a) and used to monitor current-induced depolarizations.
As controls, another set of plants were implanted with Pt wires and surface 
electrodes but not subjected to current injection. After current injection plants 
were incubated in the light for 1h then the current-injected leaves and control 
leaves were cut off at the base of petiole. Before conductivity measurements the 
leaf surfaces were briefly washed with water to remove the agar droplets and 
petioles were attached so that only the part of the lamina that is harvested for 
JAZ10 measurements (see Fig. 2a) came into contact with deionised water (25 ml; 
3 leaves per measurement). After 20 min gentle agitation the leaves were removed 
and the conductivity of the water was measured at 22°C with a Hanna 
Instruments EC215 conductivity meter (Distrelec). Positive controls were leaves
infiltrated on either side of the midrib of the abaxial lamina, each time with 10 µl
of 1% (v/v) Triton X-100, 1 h before ion leakage analysis. Negative controls were
untreated leaves and an additional control for Triton X-100 infiltrations was
water (25 ml) into which 20 µl of the 1% Triton X-100 solution was added.

Pharmacological treatments. LaCl₃, catalase and diphenyleneiodonium (DPI)
were from Sigma. Solutions were made in water; the DPI solution contained in
addition 1% (v/v) dimethyl sulphoxide (DMSO). The abaxial surface of each side of
the main vein of leaf 8 was infiltrated with the inhibitor (10 µl) in the part that was
later wounded. 25-30 min after infiltration leaf 8 was wounded and WASPs were 
recorded on leaf 13. Control experiments were carried out in exactly the same way 
with infiltration solutions that contained only the corresponding solvent. A similar 
experimental design was used for JAZ10 expression analyses. Compounds were 
infiltrated into leaf 8 (two 10 µl infiltrations, one each side of the abaxial lamina).
30 min after infiltration leaf 8 was wounded and RNA was harvested 1h later. 
Quantitative PCR. Total RNA was extracted with an RNeasy Plant Mini Kit 
(Qiagen) or with DNA-free RNA isolation protocols**. Total RNA (1 µg) was
copied into complementary DNA with M-MLV Reverse Transcriptase, RNase H 
Minus, Point Mutant (Promega) first-strand synthesis system and oligo(dT) pri- 
mers according to the manufacturer’s instructions. Quantitative PCR (qPCR) 
analysis was performed on 100 ng of cDNA in a final volume of 20 µl according
to the FullVelocity SYBR Green instruction manual (Stratagene) or with a homemade
master mix containing GoTaq polymerase (Promega) and its buffer, 0.2 mM
dNTPs, 2.5 mM MgCl₂, ROX dye and SYBR green in a final volume of 20 µl. qPCR
was performed in an Mx3005P spectrofluorometric thermal cycler (Stratagene). 
The data were calibrated to unwounded wild type. Ubiquitin-conjugating enzyme 
(UBC21) At5g25760” was used as reference gene. The thermal cycle conditions 
were: an initial denaturation at 95°C for 2 min, followed by 40 cycles of 20s at 
95 °C, 30s at 60 °C and 45 s at 72 °C. Three or four biological replicates were used 
for each experiment. Primers used were:
UBC21 (At5g25760) forward 5'-CAGTCTGTGTGTAGAGCTATCATAGCAT-3', reverse 5'-AGAAGATTCCCTGAGTCGCAGTT-3';
JAZ10 (At5g13220) forward 5'-ATCCCGATTTCTCCGGTCCA-3', reverse 5'-ACTTTCTCCTTGCGATGGGAAGA-3';
VSP2 (At5g24770) forward 5'-CCGTGTGCAAAGAGGCTTA-3', reverse 5'-CACAACTTCCAACGGTCAC-3';
GLR3.1 (At2g17260) forward 5'-GGCCAAGAATTCACCAGATGC-3', reverse 5'-GACCAAGAATCGCGGTTGACA-3';
GLR3.2 (At4g35290) forward 5'-ATTCACCAGAAGTGGCTGGG-3', reverse 5'-TGAAGCTGTCCGGTTTCTGA-3';
GLR3.3 (At1g42540) forward 5'-CGACCTTTCAACCGTCTTAT-3', reverse 5'-TCGAGAAGCTAAACCAGAGAA-3';
GLR3.6 (At3g51480) forward 5'-GATTAGAAGTGGGTTGGGGGA-3', reverse 5'-GAGGCAATGGTGGAGGAAGT-3'.
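
The text states that qPCR data were standardized to UBC21 and calibrated to unwounded wild type, but it does not give the formula. One common way to do this is the 2^(−ΔΔCt) calculation, sketched below as an assumption rather than as the authors' method; it also assumes perfect doubling per cycle, and the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target_sample: float,
                        ct_reference_sample: float,
                        ct_target_calibrator: float,
                        ct_reference_calibrator: float) -> float:
    """Fold change by the 2^(-ddCt) method (assumes 100% amplification efficiency).

    sample     -- e.g. a wounded or current-injected leaf
    calibrator -- e.g. unwounded wild type
    target     -- gene of interest (for example JAZ10)
    reference  -- reference gene (for example UBC21)
    """
    delta_sample = ct_target_sample - ct_reference_sample
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    return 2.0 ** (-(delta_sample - delta_calibrator))


# Hypothetical Ct values: the target crosses threshold 7 cycles earlier after treatment.
print(relative_expression(20.0, 18.0, 27.0, 18.0))
```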

Transcriptomics. Total RNAs from leaves were isolated and purified with RNeasy 
Plant Mini Kit (Qiagen). All RNA quantities were assessed with a NanoDrop ND-1000
spectrophotometer and the RNA quality was assessed using RNA 6000
NanoChips with the Agilent 2100 Bioanalyzer (Agilent). For each sample, 300 ng 
of total RNA were amplified using the MessageAmp II-Biotin Enhanced Single 
Round aRNA Amplification Kit (AM1791, Ambion). 12.5 µg of the resulting
biotin-labelled complementary RNA was chemically fragmented. Affymetrix
ATH1 (batch 1211501) arrays (Affymetrix) were hybridized with 11 µg of fragmented
target, at 45 °C for 17 h and washed and stained according to the protocol
described in Affymetrix GeneChip Expression Analysis Manual (Fluidics protocol 
FS450_0007). The arrays were scanned using the GeneChip Scanner 3000 7G 
(Affymetrix) and raw data was extracted from the scanned images and analysed 
with the Affymetrix Power Tools software package (Affymetrix). Statistical analysis 
was performed using the free high-level interpreted statistical language R and various 
Bioconductor packages (http://www.Bioconductor.org). Hybridization quality was 
assessed using the Expression Console software (Affymetrix). Normalized expres- 
sion signals were calculated from Affymetrix CEL files using the RMA normalization 
method. Differential hybridized features were identified using the Bioconductor 
package “limma” that implements linear models for microarray data’*. The P values 
were adjusted for multiple testing with Benjamini and Hochberg’s method to control 
the false discovery rate (FDR). Probe sets showing at least twofold change and a 
P-value < 0.05 were considered significant. 
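
The analysis itself was run in R with Bioconductor packages. Purely to illustrate the last two steps (Benjamini and Hochberg adjustment, then the twofold and P < 0.05 filter), here is a plain Python sketch. The function names are hypothetical, and applying the 0.05 cut-off to the adjusted P values is an assumption of this sketch.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest P first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def significant(log2_fold_changes: Sequence[float],
                pvalues: Sequence[float],
                min_abs_log2_fc: float = 1.0,   # twofold change
                alpha: float = 0.05) -> List[bool]:
    """Flag probe sets with at least a twofold change and adjusted P < alpha."""
    adjusted = benjamini_hochberg(pvalues)
    return [abs(fc) >= min_abs_log2_fc and q < alpha
            for fc, q in zip(log2_fold_changes, adjusted)]
```

A twofold change corresponds to an absolute log2 fold change of at least 1, which is why the default threshold is 1.0.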

JA and JA-Ile quantification. Isopropanol and methanol were obtained from VWR 
Prolabo and used for extraction analysis. Liquid chromatography mass spectrometry- 
grade acetonitrile and water from Biosolve were used for the high-performance 
liquid chromatography. The internal standards used were [¹⁸O₂]jasmonic acid*?
and [¹³C₆]jasmonoyl-L-isoleucine®". Frozen leaves (200 mg, from 5-week-old plants)
were ground in a ball mill extractor with internal standards (40 ng ml⁻¹) before
extraction with isopropanol (ref. 10). Chlorophyll was removed with a C18 solid-phase
extraction cartridge using methanol:H₂O (85:15, v/v) for elution. The eluate was
concentrated and dissolved in 100 µl methanol:H₂O (85:15, v/v). Separation was
carried out on a Phenomenex Kinetex 2.6-µm C18 100-Å column (100 × 3.0 mm).
Gradient elution was at a flow rate of 0.4 ml min⁻¹ with the following solvent system:
A = 0.1% formic acid/water, B = 0.1% formic acid/acetonitrile; 5% B for 3 min, 5–75%
B in 11 min, 75–95% B in 2 min, 95% B for 2 min and 95–5% B in 2 min. The electrospray
ionisation conditions were as follows: capillary voltage 3,300 V; cone voltage
24 V; extractor 3 V; radio frequency (RF) lens 0 V; source temperature 120 °C; desolvation
temperature 350 °C; cone gas flow 900 l h⁻¹ and desolvation gas flow 27 l h⁻¹.
Jasmonates were monitored with quantitative multiple reaction monitoring (MRM) in
a Quattro micro API mass spectrometer (Waters) with an electrospray ionization
interface coupled with the Agilent LC system (Hewlett Packard). Detection was performed
in negative ion mode over an m/z range of 100–1,000. The MRM transitions
were: JA: 209.1 > 58.7, [¹⁸O₂]JA: 213.1 > 62.8, JA-Ile: 322.2 > 130.0 and JA-[¹³C₆]Ile:
328.2 > 136.0 (parent > daughter). The limit of quantification (LOQ = 3× limit of
detection) could reach up to 9.2 pmol g⁻¹ fresh weight (FW) for JA and 4.5 pmol g⁻¹
FW for JA-Ile. Data below LOQ were considered as non-informative.
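
The MRM transitions listed above can be sanity-checked against the expected mass shifts of the labelled internal standards. The sketch below assumes the JA standard carries two ¹⁸O atoms and the JA-Ile standard carries six ¹³C atoms, which is consistent with the roughly 4 and 6 unit differences between labelled and unlabelled transitions. The function names and the 0.2 tolerance are illustrative choices.

```python
# Mass differences between heavy and light isotopes (unified atomic mass units).
ISOTOPE_MASS_DIFFERENCE = {
    "18O": 17.999160 - 15.994915,  # 18O minus 16O
    "13C": 13.003355 - 12.000000,  # 13C minus 12C
}


def label_mass_shift(isotope: str, count: int) -> float:
    """Expected mass shift for `count` heavy atoms of the given isotope."""
    return ISOTOPE_MASS_DIFFERENCE[isotope] * count


def transition_matches(unlabelled_mz: float, labelled_mz: float,
                       isotope: str, count: int, tolerance: float = 0.2) -> bool:
    """Check that a labelled m/z differs from the unlabelled one by the expected shift."""
    observed = labelled_mz - unlabelled_mz
    return abs(observed - label_mass_shift(isotope, count)) <= tolerance


# Parent and daughter ions for JA (assumed two 18O) and JA-Ile (assumed six 13C).
print(transition_matches(209.1, 213.1, "18O", 2), transition_matches(58.7, 62.8, "18O", 2))
print(transition_matches(322.2, 328.2, "13C", 6), transition_matches(130.0, 136.0, "13C", 6))
```

That the daughter ions shift by the same amount as the parents suggests that each monitored fragment retains all of the labelled atoms.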
Genotyping of T-DNA insertion lines. T-DNA insertion lines were obtained 
from the Nottingham Arabidopsis Stock Centre (NASC) except for the respir- 
atory burst oxidase homologue D mutant rbohD (Salk_070610) which was from 
Y. Lee (T-DNA line) and F. Mauch (dSpm line*). For genotyping, 5 mg fresh leaf 
samples were placed into 96-well microtitre plates and tissues were ground using 
a Qiagen TissueLyser II (Retsch Technology). Then, 60 µl of extraction buffer
(200 mM Tris-HCl pH 7.5, 250 mM NaCl, 25 mM Na₂EDTA, 0.5% SDS) was
added to each well and the samples were centrifuged at 4,000g for 10 min. The
supernatants were transferred into new microtitre plates and the same volume
(50 µl) of isopropanol was added. The plates were then centrifuged at 4,000g for
5 min. The resultant pellets were washed with 70% ethanol (150 µl) and centrifuged
at 4,000g for 5 min. Finally, DNA was resuspended in 50 µl deionized water.
2 µl of this extracted DNA was used as template for each final 20 µl PCR reaction.
Sequences of primer pairs used for genotyping of T-DNA insertion lines: 

glr1.1 (At3g04110, salk_057748) forward, 5′-ACCTCTTGACGCGTATGAAAG-3′; reverse, 5′-GTGAAAAAGAAAAGCCAAGGG-3′.
glr1.4 (At3g07520, salk_129955) forward, 5′-TATATTTGGCCAAGCTCAACG-3′; reverse, 5′-CTTATAGTGCGGGCTTTGTTG-3′.
glr2.3 (At2g24710, salk_113260) forward, 5′-TATTTGCGGAAGTTCCATTTG-3′; reverse, 5′-AGAGCGACAAGAAACAGAACC-3′.
glr2.7 (At2g29120, salk_121990) forward, 5′-GGAAATCTTGCCGGTTAAAAG-3′; reverse, 5′-ACAAATTTGGGGACATTAGGG-3′.
glr2.8 (At2g29110, salk_111695) forward, 5′-GAGTACCTTTCCCTGACCCTG-3′; reverse, 5′-GAAGGGAGGAGAAGAATGGTG-3′.
glr2.9 (At2g29100, salk_125496) forward, 5′-TGACAAGGTGCTCCCATTATC-3′; reverse, 5′-AGAAATTCATGGTGACGGTTG-3′.
glr3.1 (At2g17260, salk_063873) forward, 5′-AGATGAACAAACGTGACCACC-3′; reverse, 5′-TGGCTTTTTGTGGTTCTGATC-3′.
glr3.2a (At4g35290, salk_150710) forward, 5′-TTTTGGATCCAGCATTAGTCG-3′; reverse, 5′-TTTTGCGGTTTTGTTTGTAGG-3′.
glr3.2b (At4g35290, salk_133700) forward, 5′-TCCATTACTCAATTTCGGTGG-3′; reverse, 5′-AAACCCAAACCAAAATCATCC-3′.
glr3.3a (At1g42540, salk_099757) forward, 5′-GATGCTGCATATGGTTGTGTG-3′; reverse, 5′-GTTGAACGATAAGCTTGCGAG-3′.
glr3.3b (At1g42540, salk_077608) forward, 5′-TGCTGTTGATCTCTTGCAATG-3′; reverse, 5′-CACACAACCATATGCAGCATC-3′.
glr3.4 (At1g05200, salk_079842) forward, 5′-GGGTTAATCCGGCTTATGAAG-3′; reverse, 5′-GAAGTGAGACTGGCCGTGTAG-3′.
glr3.5 (At2g32390, salk_035264) forward, 5′-TGAAGTTGCTGCAAATGTGAG-3′; reverse, 5′-TGTCGACATGTCCACAGCTAG-3′.
glr3.6a (At3g51480, salk_091801) forward, 5′-TTCGTTCAAAGGTGGCATAAC-3′; reverse, 5′-CGACTATGAGGAAAGACGCAG-3′.
glr3.6b (At3g51480, salk_035353) forward, 5′-ATAGTCGGTGCTGTCATTTGG-3′; reverse, 5′-TCCCCAAAAGCTCTTAAGCTC-3′.
cngc12 (At2g46450, salk_092622) forward, 5′-ATTGATGCATTGAAGTCAGGG-3′; reverse, 5′-TACTTTGGTTTCGAAGCTTGC-3′.
cngc18 (At5g14870, sail_191_H04) forward, 5′-GTTTATCGCCAAGACTGCTTG-3′; reverse, 5′-TAGCATCTCATTCACCGGATC-3′.
cngc20 (At3g17700, salk_129133) forward, 5′-AAAACAGTTACCTGGAAGCCC-3′; reverse, 5′-TGCCTTTACACCACCTTTTTG-3′.
aca11 (At3g57330, salk_121482) forward, 5′-TTGCCTCACAAATTACGTTTTG-3′; reverse, 5′-ACAAACTCCCACGTTTGACAG-3′.
clc-b (At3g27170, salk_027349) forward, 5′-TCAACCCGTGGAGTTCTGTAG-3′; reverse, 5′-GGAATTCTTGGGAGCCTGTAC-3′.
clc-e (At4g35440, salk_142812) forward, 5′-ACAAAGAACAAAAATTGGCCC-3′; reverse, 5′-CTCAACCAATCTGAGGAGCTG-3′.
kab1 (At1g04690, salk_030039) forward, 5′-GAGGGAATAGCTCCCTTGTTG-3′; reverse, 5′-GATGTGAAAGAAGCGAAATCG-3′.
akt6 (At2g25600, salk_136050) forward, 5′-GAGAGGAAGAAGAAGCCTTGC-3′; reverse, 5′-ATGGTCAGCAACATCATCCTC-3′.
skor (At3g02850, salk_097435) forward, 5′-CCCATATCTCACTGGTTCACC-3′; reverse, 5′-CCAAACTTCAGCGAAACAGAG-3′.
tpk1 (At5g55630, salk_146903) forward, 5′-AAATGTCGAGTGATGCAGCTC-3′; reverse, 5′-TCAAGTTGCTCGAACTCATCC-3′.
tpk3 (At4g18160, salk_049137) forward, 5′-ATTGATTACAGCCATTGCTGG-3′; reverse, 5′-CCGTATATCTCCATTCGGAAC-3′.
annat6 (At5g10220, salk_043207) forward, 5′-TTCTATCCACTGTAGACAGCCTG-3′; reverse, 5′-AATACGCATCTCTCTCCGTTG-3′.
pen3 (At1g59870, salk_110927) forward, 5′-GCGAGAGTTGGACTCACTTTG-3′; reverse, 5′-TCACCCAACTAAATCCTCACG-3′.
vha-e1 (At4g11150, salk_019365) forward, 5′-AAGAGTTGGTCCTTGGAAAGC-3′; reverse, 5′-GTAGATCGGATTTTCACGACG-3′.
vha-g (At3g01390, salk_087613) forward, 5′-GCTGTTACAATCGCTGAAAGC-3′; reverse, 5′-TTGAGCTTCTACCTCAGCAGC-3′.
vha-a2 (At2g21410, salk_142642) forward, 5′-ACCTCTGGCTCAAAATTGTCC-3′; reverse, 5′-TCCACATGAATATAGCCCGAG-3′.
aha1 (At2g18960, salk_118350) forward, 5′-TTCGATTCTCCCACACAGATC-3′; reverse, 5′-ACGGATTGTGATTGAGACTGC-3′.
aha2 (At4g30190, salk_073730) forward, 5′-GCGAAAACATATGAACTTTCGAC-3′; reverse, 5′-CTTAGGGAGCTGCACACACTC-3′.
aha3 (At5g57350, sail_810_C08) forward, 5′-GTAGATTGCAACGGCTATTGC-3′; reverse, 5′-TTGTCGTGAAGAAGCTATGGC-3′.
aha11 (At5g62670, salk_152723) forward, 5′-ATGACAGCGATTGAGGAAATG-3′; reverse, 5′-GGCAAAACAACATCATTGATG-3′.
rbohD (At5g47910, salk_070610) forward, 5′-TTTCAACGCCTTTTGGTACAC-3′; reverse, 5′-GTTACCTATTCTTTTGCCGGG-3′.
RT-PCR analysis of glr3.3 and glr3.6 mutants. Total RNA was extracted with
DNA-free RNA isolation protocols. Total RNA (1 µg) was copied into cDNA with the
M-MLV Reverse Transcriptase, RNase H Minus, Point Mutant first strand synthesis
system (Promega, Madison WI) and oligo(dT) primers according to the manufacturer’s
instructions. Ubiquitin-conjugating enzyme (UBC21) At5g25760 was used
as the reference gene. Three biological replicates were used for each experiment.
Primers used were: for glr3.3a, forward 5′-GTTGAACGATAAGCTTGCGAG-3′
and reverse 5′-GATGCTGCATATGGTTGTGTG-3′; for glr3.3b, forward
5′-CACACAACCATATGCAGCATC-3′ and reverse 5′-TGCTGTTGATCTCTTGCAATG-3′;
and for glr3.6a, forward 5′-TTCGTTCAAAGGTGGCATAAC-3′ and
reverse 5′-AGTTGCAGCGACTTGAACCA-3′.
VSP2pro:GUSPlus plant transformation. The VSP2 (At5g24770) promoter
(amplified using 5′-TTCTCTCTGGTTATATTTTGTTGCTG-3′ and
5′-TGTTTATATGTGTGACGCAAAGG-3′ primers) was cloned with XmaI and KpnI (New
England Biolabs) into the pUC57-L4-KpnI/XmaI-R1 plasmid, producing
pEN-L4-VSP2pro-R1 as a pENTRY clone. The pUC57-L4-KpnI/XmaI-R1 plasmid was
generated by J. Vermeer by introducing L4-KpnI/XmaI-R1 att recombination and
restriction sites into pUC57 (Invitrogen). pEN-L1-GUSPlus-L2 plasmids were
obtained with Gateway technology according to the manufacturer’s instructions
(Invitrogen) with GUSPlus cDNA (amplified from pCAMBIA1305.2 (CAMBIA))
and pDONR/ZEO (Invitrogen). The final VSP2pro:GUSPlus constructs were
generated by using a double Gateway reaction into pEDO097pFR7m24GW.
pEDO097pFR7m24GW was generated by E. M. N. Dohmann by inserting the FAST
(fluorescence-accumulating seed technology) cassette into pH7m24GW (Invitrogen).
Wild-type plants were transformed using Agrobacterium tumefaciens cells
as described previously. Transformed seeds expressing red fluorescence protein
(RFP) were selected by fluorescence microscopy. The T2 generation was used for
experiments.
Statistics. All results in main figures and extended data with error bars are
represented as mean ± s.d. according to standard methods using Microsoft
Excel. Standard error was used for Extended Data Fig. 8d. The P values were
generated with Student’s one-tail unpaired t-tests except for Fig. 4d, for which a
two-tail unpaired t-test was used. For qRT-PCR experiments, three technical replicates
were used (two technical replicates for Extended Data Fig. 4c) and the biological
replicates were indicated as ‘n’ in the figures. The technical replicates that had
≥ 0.5 difference from mean Ct were excluded.
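As a rough illustration of the replicate handling described above, the sketch below drops technical replicates whose Ct differs from the replicate mean by 0.5 or more, then reports the mean and sample standard deviation of biological replicates. The function names and the example numbers are hypothetical, and two details are assumptions rather than statements from the text: the deviation is taken against the mean of all technical replicates, and the s.d. is the sample (n − 1) form.

```python
from statistics import mean, stdev


def filter_technical_replicates(ct_values, max_diff=0.5):
    """Keep technical replicates whose Ct lies within max_diff of the replicate mean."""
    m = mean(ct_values)
    return [ct for ct in ct_values if abs(ct - m) < max_diff]


def summarize(values):
    """Return (mean, sample standard deviation) for biological replicates."""
    return mean(values), stdev(values)


# Hypothetical Ct values for one sample measured in triplicate.
technical_ct = [24.1, 24.2, 25.2]
kept = filter_technical_replicates(technical_ct)

# Hypothetical relative expression values from three biological replicates.
avg, sd = summarize([1.0, 2.0, 3.0])
```

In this example the third technical replicate sits 0.7 cycles from the mean and is excluded, so the sample Ct would be averaged over the remaining two.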
Extended Data Figure 1 | Insect- and mechanical-damage-induced
membrane depolarizations. a, The setup showing the ring cage around the
insect (S. littoralis) and the position of the recording electrodes (e2 and e3) on
leaf 8. b, Surface potential recording from electrode e2 while S. littoralis walked
on the leaf. c, Typical surface potential changes recorded on electrode e3 during
S. littoralis feeding. The arrowheads indicate periodicity in the signal. d, A
proportion of WASPs induced by mechanical damage show periodicity. Filled
arrowhead, time of wounding. The apical 40% of leaf 8 was wounded with
forceps. Periodicity (unfilled arrowheads) was seen in 61% (n = 110) of
experiments. e, Chilling-induced depolarization generated by gently placing
water (150 µl, 0 °C) onto leaf 8 at the time indicated with the arrowhead.
Chilling induced a change in surface potential in 3 out of 7 recordings. f, Typical
WASP of the same polarity. For d, e and f the recording electrode was on leaf 8
at position e3 (Fig. 1a in the main text). g, Amplitude of the change in surface
potential (± s.d.) induced by wounding or by cold water.
Extended Data Figure 2 | Apparent heterogeneity in WASP velocities.
a, WASP characteristics in wounded leaf 8. b, Wound-activated surface
potential changes in leaves 5, 9, 11, 13 and 16. Leaf 8 was wounded and surface
potentials were monitored in distal leaves with electrodes placed on these leaves
at position e3′. For leaf 8 the monitoring electrode was at position e2. W,
wounded; x, number of experiments in which amplitudes of surface potentials
exceeded −10 mV. Values are means ± s.d. c, Leaf-to-leaf signal speeds. Leaves
8 or 12 (the largest rosette leaves in 6-week-old plants) were chosen for
estimating the apparent velocities of signals that travel within the wounded leaf.
For leaf-to-leaf recordings, leaf 8 was wounded and recordings were made both
on this leaf and on leaf 13. Analysis of variance (ANOVA) followed by
Bonferroni post-hoc test showed that the WASP speed (indicated in cm min−1)
along the midrib and petiole within a leaf was not significantly different
between leaves 8, 12 and 13, but was faster than the overall signalling speed
from leaf 8 to leaf 13, and the signal speed from the wound to the lamina
electrode (eL).
Extended Data Figure 3 | Wounding young leaves triggers WASPs and
JAZ10 expression in older leaves. a, Electrode placements on leaves 8 (e3), 9
(e4) and 13 (e5). b, Typical changes in surface potential in leaves 8, 9 and 13
after wounding leaf 13. Arrowhead shows the time of wounding (W). c, WASP
amplitudes (± s.d.) after wounding of leaf 13. d, WASP durations (± s.d.) after
wounding of leaf 13. e, JAZ10 expression 1 h after wounding leaf 13 (± s.d.). U,
unwounded leaves; W, wounded leaf 13. ***P < 0.001 (± s.d.).
Extended Data Figure 4 | Effects of interrupting WASP propagation on
JAZ10 expression. a, Experimental design: electrodes were placed on the
midrib (e2) and petiole base (e3) of leaf 8, and on leaf 9 (e4) and leaf 13 (e5). 40%
of leaf 8 was wounded. b, WASP traces for leaves 9 (non-parastichious) and leaf
13 (connected) provoked by wounding leaf 8. The first pair of traces was
recorded when leaf 8 was severed upon detection of a signal at e2 and before a
WASP was detected at e3. The second pair of traces was recorded when the
WASP generated by wounding leaf 8 was allowed to reach e3 and the leaf was
then severed immediately. c, JAZ10 expression in unwounded leaves (U),
wounded leaf 8 (W) and leaves 9 and 13. Left of dashed line: JAZ10 levels in
leaves 8, 9 and 13 of intact control plants 1 h after wounding leaf 8. Right of the
dashed line: plants in which the wounded leaf 8 was severed when WASPs were
detected at e2 but were not allowed to reach electrode e3 (cut no WASP) or
when WASPs were allowed to reach e3 before severing leaf 8 (cut WASP).
***P < 0.001 (± s.d.). Note: compared to crush-wounding, severing the
petioles of otherwise undamaged leaves with sharp blades does not activate
jasmonate signalling strongly in distal leaves.
Extended Data Figure 5 | Current injection does not cause cell death in the
lamina but elicits surface potential changes. a–c, Trypan blue staining.
a, Undamaged leaf. b, Pt wires inserted but no current injected. c, Current-
injected leaf. Leaves were harvested 1 h after current injection. Cells were killed
around the Pt wires (arrowheads) but CI did not cause increased staining of the
lamina. Scale bars in boxes, 200 µm. d, Ion leakage analysis after current
injection (CI). For controls leaves were either untreated or implanted with Pt
wires and connected to three surface electrodes on the laminas (no CI). A
further set of leaves was prepared identically but subjected to CI (40 µA, 10 s;
CI) and harvested 1 h later for conductivity analyses. Positive controls: leaves
infiltrated with 20 µl Triton X-100 (1% v/v in water) 1 h before harvest (‘TX-100
infiltration’). For analysis, leaves were excised at the base of the petiole and
attached so that only their laminas were bathed in deionised water (25 ml) for
20 min at 22 °C. A control for the Triton X-100 infiltration was 20 µl Triton
X-100 (1% v/v in water; TX-100 control), ± s.d. e, Surface potential changes in
different parts of leaf 8 generated by current injection. Current (40 µA, 10 s) was
injected into the petiole of leaf 8 (see Fig. 2a in the main text for electrode
placements). x/n = the number of experiments in which signal amplitudes
exceeded −10 mV/total number of experiments. Values are means ± s.d.

e
Electrodes     Latency (s)   Amplitude (mV)   Duration (s)   x/n
Lamina (eL)    22 ± 9        −87 ± 21         47 ± 21        33/47
Midrib (eP)    6 ± 3         −79 ± 12         75 ± 20        44/47
Midrib (eD)    15 ± 3        −80 ± 24         52 ± 23        43/47
Average                      −81 ± 19         59 ± 25

Note: An apparent velocity of surface potential displacement of 6.4 ± 1.9 cm min−1
was estimated from recordings at eP and eD.
Extended Data Figure 6 | WASP generation in jasmonate biosynthesis and
perception mutants. a, Typical recording from leaf 8 of the wild type after
wounding the leaf tip. b, A typical recording from leaf 8 of the allene oxide
synthase (aos) mutant after similar damage. In both cases the recording electrode
was placed at position e3 (shown in Fig. 1a in the main text) before wounding the
apical 40% of leaf 8. Arrowheads indicate the time of wound infliction (W).
c, WASP amplitude (± s.d.) in wild-type and aos plants. d, Surface potential
changes following CI (40 µA for 10 s) in the coronatine-insensitive 1-1 (coi1-1)
mutant. Art, artefacts recorded in the leaf during CI (bar = 10 s). Note that the
signal amplitude at eP reaches a maximum before that at eD and eL. For
electrode placements see Fig. 2a. e, Relative JAZ10 levels in wounded WT and in
the coi1-1 mutant that had been wounded or into which current (40 µA, 10 s) had
been injected. Leaves were harvested 1 h after wounding or current injection. U,
unwounded; W, wounded; CI, current injection. Significant differences from the
unwounded wild type are indicated, *P < 0.05, ***P < 0.001 (± s.d.).
a
Locus       Annotation   CI/no CI            Leaf 13/control     W/control
                         FC      P-value     FC      P-value     FC      P-value
At1g19180   JAZ1         4.7     4.8E-07     61.5    3.6E-12     22.7    5.3E-04
At1g74950   JAZ2         55      5.1E-08     19      4.6E-11     8       1.6E-06
At3g17860   JAZ3         2.4     1.8E-06     5.9     3.9E-10     55      4.8E-06
At1g17380   JAZ5         13.6    6.4E-09     102.8   3.6E-12     22      4.7E-06
At1g72450   JAZ6         4.3     3.9E-08     12.8    2.7E-11     5       5.7E-06
At2g34600   JAZ7         25.9    9.0E-12     143     3.2E-14     18.6    7.5E-04
At1g30135   JAZ8         6.9     1.6E-07     51      1.9E-11     16.8    5.8E-06
At1g70700   JAZ9         7       1.1E-07     24.8    2.0E-10     6.1     1.6E-05
At5g13220   JAZ10        28.2    1.9E-10     1231    1.5E-12     28.8    8.4E-06
At5g20900   JAZ12        1.8     1.9E-05     2.9     2.0E-08     2.5     6.2E-05

b
Venn diagram (sets: leaf 8 after current injection; leaf 13 after wounding leaf 8; rosette after wounding).

c
Locus       Annotation                                      CI/no CI           Leaf 13/control    W/control
                                                            FC     P-value     FC     P-value     FC      P-value
At2g24762   AtGDU4 (GLUTAMINE DUMPER 4)                     -3.1
At1g80440   Member of the GDU (glutamine dumper)            -2.8   2.6E-06     -6.9   1.3E-09     -14.1   2.7E-07
At5g02760   Protein phosphatase 2C family protein           -2.4   2.7E-02     -11.1  7.9E-06     -2.2    1.4E-04
At5g22920   RING-type Zinc finger protein                   -2.3   1.6E-05     4.8    1.3E-08     -11.6   7.4E-05
At1g12200   zinc finger (C3HC4-type RING finger)            -2.3   6.9E-05     -2.1   2.4E-04     -2.2    8.2E-04
At1g73830   BEE3 (BR ENHANCED EXPRESSION 3)                 -2.2   3.9E-03     -2.7   5.3E-04     -5.8    5.7E-05
At2g44130   Galactose oxidase/kelch repeat superfamily      -2.2   1.7E-08     -2.3   8.1E-09     -7.1    7.8E-06
At2g40610   ATEXPA8 (EXPANSIN A8)                           -2.1   8.8E-03     -5.4   5.9E-06     -3.8    2.4E-03
At4g30110   HMA2; cadmium-transporting ATPase               -2.1   5.2E-06     -4.7   9.8E-10     -2.2    5.7E-03
At2g15890   MEE14 (maternal effect embryo arrest 14)        -2.1   1.4E-03     -3.0   3.7E-05     -4.0    1.9E-05
At3g46130   MYB111 (MYB DOMAIN PROTEIN 111)                 -2.1   1.4E-07     -2.3   2.2E-08     -2.0    3.5E-03
At1g23390   Kelch repeat-containing F-box family protein    -2.0   3.0E-05     4.9    2.0E-09     -12.3   5.1E-07
At5g51560   Leucine-rich repeat protein kinase family       -2.0   2.6E-05     -2.5   1.2E-06     -2.2    5.3E-03

Extended Data Figure 7 | Selected genes for which expression was altered
upon current injection. a, List of the JAZ genes that were upregulated 1 h after
current injection (CI) into leaf 8 (this study), in leaf 13 at 1 h after wounding leaf
8 (this study), or in wounded leaves of 18-day-old plants (from ref. 32). b, Venn
diagram showing downregulated (>twofold, P < 0.05) genes for current
injected leaf 8 (this study), for leaf 13 from plants wounded on leaf 8 (this
study), and for wounded rosette leaves (‘rosette after wounding’, from ref. 32).
c, List of common genes that were downregulated more than twofold (P ≤ 0.05)
1 h after current injection into leaf 8 (this study), in leaf 13 1 h after wounding
leaf 8 (leaf 13, this study), and in wounded leaves of 18-day-old plants (ref. 32).
FC, fold change.
Extended Data Figure 8 | Effect of inhibitors and rbohD on WASP
generation and JAZ10 expression. a–c, Inhibitors were tested for their effects
on WASP generation. a, Diphenyleneiodonium chloride (DPI; 50 µM in H2O
containing 1% v/v DMSO), b, catalase (100 U µl−1 in H2O) and c, lanthanum
chloride (LaCl3, 2 mM in H2O) were infiltrated into leaf 8 at 25-30 min before
wounding. After wounding leaf 8, WASP amplitude and duration were
measured on leaf 13. For controls leaf 8 was infiltrated only with carrier.
*P < 0.05 (± s.d.). d, JAZ10 transcript levels in leaf 13 following infiltration of
DPI (50 µM in H2O containing 1% v/v DMSO) into leaf 8 followed 30 min later
by wounding leaf 8 (± s.e.m.). Controls (CON) were infiltrated with carrier.
e, Similar wound-induced expression of JAZ10 in WT and rbohD plants. Plants
(wild type or rbohD-dSpm) were wounded on leaf 8 (W). After 1 h leaves 8 and
13 were harvested and JAZ10 expression measured by qRT-PCR (± s.d.).
U, unwounded; W, wounded.
Panel labels: b, control versus catalase (n = 6); d, JAZ10, CON versus DPI (n = 4); e, JAZ10, leaves 8 and 13 (n = 5).
Locus   Annotation   Stock name   Leaf 8: amplitude (mV)   Leaf 8: duration (s)   Leaf 13: amplitude (mV)   Leaf 13: duration (s)   n
z WT Col-0 -76411 163+30 5149 78420 33 
At3g04110 gir1.1 salk_057748 —- -78412 459442 -4645 63418 6 
At3g07520 gir1.4 salk_129955 — -81423 149+66 54214 8149 5 
At2g24710 glr2.3 salk_113260 — -78+20 140+12 -32411 7948 7 
At2g29120 glr2.7 salk_121990 -96412 213434 -43410 74411 5 
At2g29110 glr2.8 salk_111695 -84417 272432 -5548 92415 5 
At2g29100 glr2.9 salk_125496 -59414 219429 4344 1034234 
At2g17260 gir3.1 salk_063873  -81418 110#25 48413 1048 9 
At4g35290 gir3.2a salk_150710 -8747 3746 -46+11 3424 8 
At4g35290 glr3.2b salk_133700  -86+10 99+26 -29414 33423 8 
At1g42540 gir3.3a salk_099757 -91411 51410 -47410 36418 9 
At1g42540 gir3.3b salk_077608 -71415 3646 -40418 2345 9 
At1g05200 gir3.4 salk_079842 -75416 70+9 -3610 59+16 7 
At2g32390 gir3.5 salk_035264 -8349 94415 5816 63413 7 
At3g51480 gir3.6a salk_091801 -73412 116+18 -4446 1647 9 
At3g51480 glr3.6b salk_035353  -65412 118+10 -424+19 29416 6 
At2g46450 cngc12 _—salk_092622—_ -69+16 299431 -3644 87414 6 
At5g14870 cngc18 —sail_191_HO4 —- -8736 430414 -464+6 74412 6 
At3g17700 cngc20 _—salk_129133 -7048 >400 -40+10 88411 4 
At3g27170 cle-b salk_027349 9143 309411 42412 5546 5 
At4g35440 cle-e salk_142812 -86418 148431 -5049 4844 6 
At1g04690 kab1 salk_056819 -83+15 156425 -47410 78427 14 
At2g25600 akt6 salk_136050 -84+16 144424 -52+10 93425 6 
At3g02850 skor salk_097435 = -71416 209456 47412 74420 7 
At5g55630 tpk1 salk 146903 81411 125+12 -33412 45435 8 
At4g18160 tpk3 salk_049137  —_-82414 117413 51412 4849 9 
At5g10220 annat6 salk_043207.  -81412 146421 -30412 64421 7 
At1g59870 pen3 salk_000578 = -91410 147425 53412 8048 7 
At4g11150 vha-e1 salk_019365 -8147 >400 4244 81415 5 
At3g01390 vha-g salk_087613  -87410 232476 -7149 87418 5 
At2g21410 vha-a2 salk_142642 -7949 177417 5349 118419 7 
At2g18960 ahat salk_118350 -63+20 109+31 -36419 70+28 9 
At4g30190 aha2 salk_073730 -76+19 118438 -4649 78435 8 
At5g57350 aha3 sail_810_08 71417 127438 58410 79419 7 
At5g62670 ahat1 salk_152723.—-77415 101433 5314 87420 9 
At5g47910 rbohD salk_070610  -92+92 9836 53412 61413 9 
At5g47910 rbohD dSpm -84420 104+14 -504£23 70#19 7 
proton ahataha3 rg ae 79413 202247 41411 1264297 
pose teen aha2aha3 re ta -82410 128433 44415 72430 8 
oe gir3.3agir3.6a es eed 55418 8,541.7 0+0 0+0 11 
oe gir3.3bglr3.6a pats fe 51416 9.341.3 020 0+0 7 

Extended Data Figure 9 | Characterization of wound-activated surface
potential changes (WASPs) in homozygous T-DNA insertion lines. Leaf 8
was wounded and the surface potential was monitored in leaf 8 and distal leaf
13. For leaf 8, an electrode was placed 3 cm from the leaf apex wound (Fig. 1a,
position e3). All measurements for leaf 13 were from electrodes placed on the
petiole 1 cm from the centre of the rosette (position e3′ in Extended Data Fig.
2c). n, number of experiments. Values are means ± s.d. Mutants displaying
WASP durations of <60 s in leaf 8 or <40 s in leaf 13 are highlighted.
type and in T-DNA insertion lines. a, Level of GLR3.1 transcripts in glr3.1a
(Salk_063873). b, Level of GLR3.2 transcripts in glr3.2a (Salk_150710) and
glr3.2b (Salk_133700). c, Level of GLR3.3 transcripts in glr3.3a (Salk_099757),
glr3.3b (Salk_077608) and double mutant glr3.3a glr3.6a. d, Level of GLR3.6
transcripts in glr3.6a (Salk_091801), glr3.6b (Salk_035353) and double mutant …
Significant differences to the wild type are indicated, *P < 0.05, **P < 0.01,
***P < 0.001 (± s.d.). e, RT-PCR analyses of the expression pattern of GLR3.3
and GLR3.6 genes in glr3.3a glr3.6a and glr3.3b glr3.6a double mutants. UBC21
was the reference transcript.
An observational correlation between stellar
brightness variations and surface gravity

Fabienne A. Bastien1, Keivan G. Stassun1,2, Gibor Basri3 & Joshua Pepper1


Surface gravity is a basic stellar property, but it is difficult to measure
accurately, with typical uncertainties of 25 to 50 per cent if
measured spectroscopically and 90 to 150 per cent if measured
photometrically. Asteroseismology measures gravity with an uncertainty
of about 2 per cent but is restricted to relatively small samples
of bright stars, most of which are giants. The availability of high-precision
measurements of brightness variations for more than
150,000 stars provides an opportunity to investigate whether the
variations can be used to determine surface gravities. The Fourier
power of granulation on a star’s surface correlates physically with
surface gravity: if brightness variations on timescales of hours
arise from granulation, then such variations should correlate with
surface gravity. Here we report an analysis of archival data that
reveals an observational correlation between surface gravity and root
mean squared brightness variations on timescales of less than eight
hours for stars with temperatures of 4,500 to 6,750 kelvin, log surface
gravities of 2.5 to 4.5 (cgs units) and overall brightness variations of
less than three parts per thousand. A straightforward observation of
optical brightness variations therefore allows a determination of the
surface gravity with a precision of better than 25 per cent for inactive
Sun-like stars at main-sequence to giant stages of evolution.
Brightness variations of Sun-like stars are driven by many factors,
including granulation, oscillations, rotation and magnetic activity.
As these stars evolve from high-surface-gravity (high-g) dwarfs to low-g
giants, their convective zones deepen, they rotate more slowly, their
magnetic activity diminishes, and their oscillation and granulation
timescales increase, all of which change the nature of the brightness
variations. It has been previously demonstrated that the power in
granulation (as traced by the Fourier spectrum of the brightness variations)
is inversely proportional to νmax, the peak frequency of Sun-like
acoustic oscillations. Given that νmax is itself proportional to g (ref. 11),
it follows that g should manifest in brightness variations on timescales
that trace granulation. Although physically we expect this, it is not
immediately apparent that brightness variations can be used as an
effective determinant of g because other phenomena not directly related
to g (most importantly spots, plage and other sources of brightness
variations driven by the star’s magnetic activity) probably dominate
the observed brightness variations. It is therefore necessary to filter out
the brightness variations arising from these phenomena, which occur
on timescales of hours to days, while preserving the brightness variations
related to granulation and g on timescales of minutes to hours.
Using long-cadence (30 min) light curves from Quarter 9 of NASA’s
Kepler Mission, and representing them using the Filtergraph data
visualization tool, we observe clear patterns in the evolutionary properties
of stars encoded in three simple measures of their brightness
variations (Fig. 1): range (Rvar), number of zero crossings (X0), and
root mean square on timescales shorter than 8 h (to which we will refer
as ‘8-hr flicker’ or F8). Relating these measures to g determined
asteroseismically from a sample of Kepler stars, we find distinctive features
that highlight the way stars evolve in this three-dimensional space, making
up an evolutionary diagram of photometric variability. Within this
diagram, we find a vertical cloud of points, largely made up of high-g
dwarfs, that have large Rvar, small X0 and low F8 values. We observe a
tight sequence of stars (a ‘flicker-floor’ sequence that defines a prominently
protruding lower envelope in Rvar) spanning gravities from
dwarfs to giants. Sun-like stars of all evolutionary states evidently move
onto this sequence only when they have a large X0, which in turn
implies low stellar activity.
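The three variability measures can be sketched in plain Python for an evenly sampled, gap-free light curve. This is only an illustration, not the pipeline used for the figures: the 30-min cadence and the 8-h timescale come from the text, whereas the boxcar filter, the 5th to 95th percentile definition of the range and the 10-h smoothing before counting zero crossings are assumptions, and all function names are hypothetical.

```python
import math

CADENCE_HOURS = 0.5  # Kepler long cadence (30 min), from the text


def boxcar_smooth(flux, width):
    """Centred running mean over 2*(width//2)+1 points, truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(flux)):
        window = flux[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def flicker(flux, hours=8.0):
    """RMS of the light curve after removing variations slower than `hours`."""
    width = int(round(hours / CADENCE_HOURS))  # 16 cadences for 8 h
    smooth = boxcar_smooth(flux, width)
    resid = [f - s for f, s in zip(flux, smooth)]
    return math.sqrt(sum(r * r for r in resid) / len(resid))


def variability_range(flux, lo=0.05, hi=0.95):
    """Spread between two quantiles of the flux (quantile choice is an assumption)."""
    ordered = sorted(flux)
    n = len(ordered)
    return ordered[int(hi * (n - 1))] - ordered[int(lo * (n - 1))]


def zero_crossings(flux, smooth_hours=10.0):
    """Count sign changes of the smoothed, mean-subtracted light curve."""
    width = int(round(smooth_hours / CADENCE_HOURS))
    smooth = boxcar_smooth(flux, width)
    mean = sum(smooth) / len(smooth)
    signs = [1 if s - mean >= 0 else -1 for s in smooth]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


# Synthetic 90-day light curve: a slow 10-day modulation (spot-like) only.
n_points = 90 * 48
slow = [math.sin(2 * math.pi * i / (48 * 10)) for i in range(n_points)]
```

For the slow synthetic signal the range is large while the flicker stays close to zero, which is the separation of timescales the text relies on: long-period, spot-like modulation is filtered out of the 8-h measure.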

We find that g is encoded in F8, yielding a tight correlation between
the two (Fig. 2). Moreover, using 11 yr of SOHO Virgo light curves
of the Sun and sampling them at the same cadence as the Kepler
long-cadence light curves, we find that the Sun’s (constant) g is also
measurable using F8, which remains invariant throughout the 11-yr solar
activity cycle even while the Sun’s Rvar and X0 values change considerably
from the spot-dominated solar maximum to the nearly spotless
solar minimum (Fig. 1). From the Sun’s behaviour, we infer that a large
portion of the Kepler stars’ vertical scatter within the vertical cloud at
the left of the diagram (Fig. 1) may be driven by solar-type cyclic activity
variations. Most importantly, the Sun’s true g fits our empirical
relation (Fig. 2), and the g value of any Sun-like Kepler star from dwarf
to giant may be inferred from this relation with an accuracy of
0.06-0.10 dex (Supplementary Information).
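For readers less used to logarithmic units, the short sketch below computes the Sun's log g in cgs units from g = GM/R² and converts an uncertainty in dex into a fractional uncertainty on g. The physical constants are standard reference values and are not taken from the article; the function names are hypothetical.

```python
import math

G_CGS = 6.674e-8      # gravitational constant, cm^3 g^-1 s^-2
M_SUN_G = 1.989e33    # solar mass, g
R_SUN_CM = 6.957e10   # solar radius, cm


def log_g(mass_g, radius_cm):
    """log10 of the surface gravity g = G M / R^2, in cgs units."""
    return math.log10(G_CGS * mass_g / radius_cm ** 2)


def dex_to_fractional_error(dex):
    """Fractional change in g corresponding to an upward shift of `dex` in log g."""
    return 10.0 ** dex - 1.0


solar_log_g = log_g(M_SUN_G, R_SUN_CM)
low = dex_to_fractional_error(0.06)
high = dex_to_fractional_error(0.10)
```

With these constants the solar value comes out at about 4.44, and 0.06 to 0.10 dex corresponds to roughly 15 to 26 per cent in g, comparable to the precision quoted in the abstract.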

Asteroseismic analyses derive g from the properties of stellar acoustic
oscillations. Given that near-surface convection drives both
these oscillations and granulation, and given the brightness variability
timescales to which F8 is sensitive, we suggest that a combination of
different types of granulation (with typical solar timescales ranging
from ~30 min to ~30 h; ref. 21) drives the manifestation of g in this
metric. The precise timescales of these phenomena in solar-type stars
depend strongly on the stellar evolutionary state and, hence, also on g
(refs 5,9,10,22). Acoustic oscillations, whose amplitudes are sensitive
to g (ref. 5), may provide an increasingly important contribution to F8
as stars evolve into subgiants and giants and the amplitudes and timescales
of these oscillations increase. At some point, the pressure-mode
and granulation timescales cross, which may lead to a breakdown
of our F8–g relation at very low values of g.

By using Fg to measure g, we can construct a photometric variability 
evolutionary diagram for most stars observed by Kepler, even for stars 
well beyond the reach of asteroseismic and spectroscopic analysis 
(Fig. 3). By coding this diagram according to stellar temperature and 
rotation period, we may trace the physical evolution of Sun-like stars as 
follows: stars begin as main-sequence dwarfs with large photometric 
Ryar Values and small Xo values, presumably driven by simple rota- 
tional modulation of spots at relatively short rotation periods. As the 
stars spin down to longer rotation periods, their brightness variations 
first become steadily ‘quieter’ (systematically lower Rya,) but then 
become suddenly and substantially more complex (larger Xg) as they 
reach the flicker floor. Some stars reach the floor only after beginning 
their evolution as low-g subgiants, having moved to the right (higher 
Fs) as their effective temperatures begin rapidly dropping. Other stars 
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Figure 1 | Simple measures of brightness variations reveal a fundamental 
‘flicker sequence’ of stellar evolution. We establish the evolutionary states of 
stars with three simple measures of brightness variations*. The abscissa, 8-h 
flicker (Fg), measures brightness variations on time scales of 8h or less. The 
ordinate, R,q,, yields the largest amplitude of the photometric variations in a 
90-d timeframe. The number of zero crossings, Xp (symbol size; ranging from 
0.01 to 2.1 crossings per day), conveys the large-scale complexity of the light 
curve. We correct both R_var and F_8 for their dependence on Kepler magnitude
(Kepmag). Colour represents asteroseismically determined g. We observe two
populations of stars: a vertical cloud composed of high-g dwarfs and some
subgiants, and a tight sequence (the flicker floor) spanning a range of g from
dwarfs to giants. The typically large R_var values of stars in the cloud, coupled
with their simpler light curves (small X_0), imply brightness variations driven
by rotational modulation of spots. In contrast, large X_0 values characterize stars
on the sequence. The F_8 values of stars in this sequence increase as g decreases
because the physical source of F_8 is sensitive to g. R_var also increases with F_8
along the floor, because F_8 is a primary contributor to R_var (as opposed to
starspots above the floor). Stars with a given F_8 value cannot have R_var less than
that implied by F_8 itself: quiet stars accumulate on the flicker floor because they
are prevented from going below it by the statistical definition of the two
quantities. Stars above the floor have larger amplitude variations on longer
timescales that set R_var. The large star symbol with vertical bars and the inset
show the Sun’s behaviour over the course of its 11-yr magnetic cycle. The Sun’s
F_8 value is largely invariant over the course of its cycle, just as its g value is
invariant. p.p.t., parts per thousand.

Figure 2 | Stellar surface gravity manifests in a simple measure of brightness
variations. The same stars from Fig. 1 with Kepler Quarter 9 data.
Asteroseismically determined g shows a tight correlation with F_8. Colour
represents the R_var of the stars’ brightness variations; outliers tend to have large
brightness variations. Excluding these outliers, a cubic-polynomial fit through
the Kepler stars and through the Sun (large star symbol) shows a median
absolute deviation of 0.06 dex and a root mean squared deviation of 0.10 dex
(Supplementary Information). To simulate how the solar g would appear in the
archival data we use to measure g for other stars, we divide the solar data into
90-d ‘quarters’. Our F_8–g relation measured over multiple quarters then yields a
median solar g of 4.442 with a median absolute deviation of 0.005 dex and a root
mean squared error of 0.009 dex (the true solar g is 4.438).


join the sequence while still dwarfs; these are easily identified in
our diagram by their drastically increased X_0 values at very low F_8.
Evidently some dwarf stars become magnetically quiet while still
firmly on the main sequence, whereas others do not reach the floor
until they begin to swell considerably. We note that the Sun seems to
approach the flicker floor at solar minimum; its R_var value becomes
quite low and its X_0 value strongly increases (Fig. 1).

A star’s main-sequence mass and initial spin probably determine 
where along the flicker-floor sequence it ultimately arrives, because the 
slope of a star’s trajectory in our diagram is essentially determined by 
the ratio of its spin-down timescale (downward motion) and structural 
evolutionary timescale (rightward motion). Regardless, once on the 
floor all stars evolve along this sequence and stay on it as they move up 
to the red-giant branch, their effective temperatures steadily dropping 
as their surfaces rapidly expand. Despite their very slow rotation as 
subgiants and giants on the flicker-floor sequence, their photometric 
R_var is steadily driven upwards by the increasing F_8, which reflects the
stars’ continually decreasing g. The increasing R_var and F_8 values of
subgiants and giants on the flicker floor are probably the result of the
increasingly important contribution of radial and non-radial pulsations
to the overall brightness variations.

A few stars appear as outliers to the basic picture we have presented 
here; these are seen towards the right of the vertical cloud of points in 
our evolutionary diagram (Fig. 3). Some active dwarfs have higher F_8
values than expected for their g values. Frequent strong flares can boost
F_8, as currently defined, and some hotter dwarfs are pulsators with
enough power near 8 h to increase their F_8 values. A few such cases
appear also in the asteroseismic sample (Fig. 1). Some lower-g stars have 
R_var values above the flicker floor owing to the presence of magnetic
activity, slow radial pulsations or secular drifts. Finally, a few outliers
are simply due to data anomalies. As our technique is refined, these
exceptions should be treated carefully before assigning an F_8-based g
value, particularly for high-F_8 stars for which R_var is greater than ~3
parts per thousand. They constitute a small fraction of the bulk sample,
and most of them can be identified as one of the above cases.

Common to all of the stars along the flicker floor is the virtual
absence of spot activity as compared with their higher-R_var counterparts;
short-timescale phenomena such as granulation and oscillations
dominate the brightness variations. Given that spots probably suppress
acoustic oscillations in the Sun and other dwarf stars, the
large X_0 values of stars along this sequence may partly reflect the ability
of short-timescale processes to manifest more strongly now that large
spots no longer impede them, along with the increasing complexity of
the convective variations. As the stars evolve into full-fledged red
giants and beyond, the principal periodicity in their brightness variations
increasingly reflects shorter-period oscillations, as opposed to their
inherently long-period rotation, because oscillations become dominant
over magnetic spots.

©2013 Macmillan Publishers Limited. All rights reserved

[Figure 3 plot, panels a and b. Bottom axis: F_8 (p.p.t., Kepmag corrected). Top axis: log10[g (cm s⁻²)].]

Figure 3 | An integrative view of stellar evolution in a new diagram of
brightness variations. Same as Fig. 1, but for Kepler stars lacking asteroseismic
g. We include a g scale at the top (from conversion of the F_8 scale at bottom via
our calibrated relationship). Here we selected stars with Kepler magnitudes
between 11.0 and 11.85 to limit the sample to ~1,000 stars for visual clarity
(1,012 points are shown). We removed objects that are potentially blended
(Kepler flux contamination greater than 0.05) as well as those that may be
galaxies (Kepler star/galaxy flag other than 0). Arrows schematically indicate
the evolutionary paths of Sun-like stars in this diagram. Stars generally move
from top to bottom, as the overall brightness fluctuations due to spots decrease
with time, and then from left to right as their g values decrease. All stars
eventually arrive on the flicker-floor sequence and evolve along it. a, Colour
represents effective temperature. Stars cool as they evolve from left to right,
from dwarfs to red giants. We restricted the effective temperatures to be
4,500–6,650 K, using the revised temperature scale for Kepler stars. b, Same as a, but
colour-coded by the dominant periodicity in the light curve. We limited the
sample to stars with dominant periods longer than 3 d (to eliminate very rapidly
rotating active stars) and shorter than 45 d (half the Kepler 90-d data interval).
This period traces rotation for unevolved stars and pulsations for evolved ones.
Dwarfs generally show the expected spin-down sequence with decreasing R_var
(correlated with the level of surface magnetic activity). Subgiants and giants
broadly display very slow rotation, as expected.

It may be possible to differentiate between stars with similar g values
but different internal structures (for example first-ascent red giants
versus helium-burning giants) through application of a sliding timescale
of F_8 as a function of g, where the sliding timescale would capture
the changing physical granulation timescales with evolutionary state.
Moreover, the behaviour of stars on the flicker floor may explain the
source of radial velocity ‘jitter’ that now hampers planet detection
through radial velocity measurements.

Measurement of a solid-state triple point at the
metal-insulator transition in VO2


Jae Hyung Park, Jim M. Coy, T. Serkan Kasirga, Chunming Huang, Zaiyao Fei, Scott Hunter & David H. Cobden


First-order phase transitions in solids are notoriously challenging 
to study. The combination of change in unit cell shape, long range 
of elastic distortion and flow of latent heat leads to large energy 
barriers resulting in domain structure, hysteresis and cracking. 
The situation is worse near a triple point, where more than two 
phases are involved. The well-known metal-insulator transition in 
vanadium dioxide, a popular candidate for ultrafast optical and
electrical switching applications, is a case in point. Even though
VO2 is one of the simplest strongly correlated materials, experimental
difficulties posed by the first-order nature of the metal-insulator
transition as well as the involvement of at least two
competing insulating phases have led to persistent controversy
about its nature. Here we show that studying single-crystal
VO2 nanobeams in a purpose-built nanomechanical strain apparatus
allows investigation of this prototypical phase transition with
unprecedented control and precision. Our results include the striking
finding that the triple point of the metallic phase and two insulating
phases is at the transition temperature, T_tr = T_c, which we
determine to be 65.0 ± 0.1 °C. The findings have profound implications
for the mechanism of the metal-insulator transition in VO2,
but they also demonstrate the importance of this approach for
mastering phase transitions in many other strongly correlated
materials, such as manganites and iron-based superconductors.

The metal-insulator transition (MIT) in VO2 is accompanied by a
large and rapid change in the conductivity and optical properties, with
potential uses in switching and sensing. VO2 has recently received
renewed attention as a convenient strongly correlated material for
the application of new ultrafast and microscopy techniques,
ionic gating and improved computational approaches. However,
the problems associated with bulk or film samples that consist of a complex
of multiple solid phases and domains under highly non-uniform
strain, as well as compositional variations such as oxygen vacancies
and hydrogen doping, make it almost impossible to disentangle the
underlying parameters on which rigorous understanding can be built.
The experiments described here eliminate these problems, allowing
unprecedented control of the MIT and accurate determination of the
underlying phase stability diagram of pure VO2.

Figure 1a illustrates the structures of the phases involved in the MIT. 
In every phase there are two interpenetrating sets of parallel chains of 
vanadium atoms each surrounded by six oxygen atoms forming a 
distorted octahedron (the oxygen atoms are not shown). In the high- 
temperature metallic (rutile, R) phase all the chains are straight and 
periodic, whereas in the low-temperature insulating (monoclinic M1) 
phase every chain is dimerized. There are also two other known insu- 
lating phases: monoclinic M2, in which only one set of chains is dimer- 
ized, and triclinic T, which is intermediate between M1 and M2. The 
existence of both M1 and M2, with similar dielectric properties yet 
different magnetic properties, provides constraints on the theory of the 
MIT; for example, it rules out a purely Peierls-type mechanism. In the
older literature the MIT is taken to occur between R and M1, although
recent studies have shown that M2 domains occur in most VO2
samples near the MIT, raising the question of its role in the transition.

Figure 1 | Control of the metal-insulator transition in VO2 using uniaxial
stress. a, Arrangement of vanadium ions in the phases involved in the MIT,
indicating their different vanadium chain periods and dimerization (yellow).
b, Expected layout of the stress-temperature phase diagram near the MIT,
showing the transition temperature T_c at zero stress. c, Experimental geometry,
showing an electron micrograph (right) of a VO2 nanobeam suspended across a
slot of width L in a silicon chip (left, optical micrograph) whose width is
controlled by pushing on the paddle and measured by deflection of a laser
beam. The yellow lines signify gold wire bonds. d, Series of optical images
showing movement of the R-M1 and M1-M2 interfaces as L is increased in
roughly 100-nm steps at 64 °C (device P7, 40 μm gap).


The largest difference in unit cell shape between R, M1 and M2
is along the pseudo-rutile c axis (the vanadium chain axis), with
c_R = 5.700 Å, c_M1 = 5.755 Å and c_M2 = 5.797 Å, as indicated in Fig. 1a.
Compressive strain along this axis in an epitaxial film can lower the
transition to room temperature; thus, applying uniaxial tensile
stress P_c along it can be used to control the transition. A stability
diagram in the P_c–T plane (with all other stress components zero) is
expected to have the layout indicated in Fig. 1b. A shaded region
indicates where the T phase occurs. The effect of P_c on the phase
stability (Fig. 1b) resembles that of stress along the [110]_R axis and of
doping by chromium. Rough ideas of the locations of the three phase
boundaries have been obtained by modelling bent nanobeams. The
triple point (T_tr, P_tr) has not been located, although M1 and M2 are
known to be very close in free energy near the transition. The stress P_tr
is normally taken to be positive, implying that a perfect unstrained crystal
shows a direct transition from M1 to R at T_c. We find that this is not in
fact true, and T_tr is identical to T_c to within ±0.05 °C, or one part in 10⁴
in absolute temperature. We further determine T_c to be 65.0 ± 0.1 °C.
In addition we present evidence that in the neighbourhood of T_c the
M1 phase can distort continuously under tension into the metastable
T phase. These discoveries have deep implications for the physics of the
MIT, for the interpretation of many measurements on VO2 crystals
and films, and for mastering the transition with a view to applications.
Our investigations of the MIT rest on the ability to precisely control 
the length of a suspended single-domain nanobeam and thereby to 
apply pure uniaxial stress along it, a situation that cannot be achieved 
in larger crystals because of domain structure. The elements of the 
experiment are illustrated in Fig. 1c (see Methods). A VO2 nanobeam 
is fixed, in some cases with electrical contacts, across a micromachined 
slot in a silicon chip whose width L can be varied with nanometre 
precision. We perform measurements only when the nanobeams are 
straight, so the maximum compressive stress is limited by buckling. By 
varying L and T, the three phases R, M1 and M2 can be induced and can 
be differentiated by reflection contrast with linearly polarized light, as
illustrated in Fig. 1d, as well as by Raman spectroscopy and measurements
of electrical resistance. Linearly polarized light also reveals
twinning, allowing us to select devices in which twinning is absent.
According to the phase diagram in Fig. 1b the state of the nanobeam
as a function of L and T should include regions of two-phase coexistence
as sketched in Fig. 2a. We find that the suspended part of the
nanobeam can indeed be brought into coexistence between any pair of
the three phases. The position of the interface changes smoothly and
reproducibly with both L and T in between sudden reconfigurations.
For the case of M2 + R coexistence we define the interface position
y_M2R as the shift relative to an initial position such that it increases as
R converts to M2. We define y_M2M1 and y_M1R similarly.

Figure 2 | Temperature and length dependence in coexistence. a, Expected
configuration of a nanobeam as a function of T and L. b, Variation of
interface positions with T at fixed L corresponding to moving along the lines in
the insets (upper: device P11, 40 μm gap; lower: P9, 20 μm gap). Each interface
type is indicated by a colour. c, Histograms of temperatures at which
reconfigurations occur for 20 cycles sweeping at 0.1 °C min⁻¹ (device P14,
40 μm). d, Sequence of images during reconfiguration from M2 + R to
M2 + M1 in a nanobeam at the triple point, 65.0 °C (device P8B, 20 μm).
e, Variation of interface positions with L at fixed T, corresponding to moving
along the vertical lines in the inset (device P14). The fractional differences in
lattice constants, α_ij, are the inverse slopes of these lines.

The MIT in VO2 is usually studied as a function of T, without paying
close attention to strain or to interconversion between M1 and M2. In
undoped samples it is seen in the range 65–68 °C, with a hysteresis of
several degrees Celsius, and the value of T_c is not known more precisely
than this. In our experiments on nanobeams, as T is varied at fixed L we
see the behaviour shown in Fig. 2b, which can be understood with
reference to the colour-coded lines in the inset phase diagrams. If we
start in M2 + M1 coexistence (Fig. 2b, upper panel, green) and increase
T, the interface position y_M2M1 first moves smoothly as the
stress required for phase equilibrium changes. Then at a temperature
T_M1→R there is a sudden reconfiguration to M2 + R coexistence
(Fig. 2b, upper panel, red) after which the interface position y_M2R
moves smoothly again. On cooling, the reverse reconfiguration occurs
at temperature T_R→M1. Starting instead at a smaller length, in M1 + R
coexistence (Fig. 2b, lower panel, blue), a jump to M2 + R coexistence
(again red) occurs at T_M1→M2, whereas the reverse occurs at T_M2→M1.
Histograms of the reconfiguration temperatures on repeated cycling at
0.1 °C min⁻¹ are shown in Fig. 2c. For this device T_M1→R and T_M1→M2
are narrowly peaked at 66.4 and 65.3 °C, respectively; for other devices
different values are found. This can be explained by superheating of
M1, which varies between devices because the ease of nucleation of the
high-temperature phase (R or M2) depends on microscopic details.

In contrast, T_R→M1 and T_M2→M1 are both peaked at the same temperature,
65.0 °C, indicated by the dotted line in Fig. 2c. In several
nanobeams of different sizes, grown on different occasions, these
two temperatures always lay in the narrow range between 64.9 and
65.2 °C; moreover, neither storage in air for 6 months nor heating
to 200 °C for 1 h changed them, indicating that effects of oxygen
vacancies and hydrogen doping were minimal. This observation
can be explained as follows. A small amount of M1 is often visible at 
the interface in M2 + R coexistence, probably because it reduces the 
elastic energy. On cooling there is therefore no need for nucleation of 
M1, and reconfiguration occurs as soon as the triple point is reached. 
In fact, the dynamics of this process can sometimes be observed. 
Figure 2d shows a sequence of images taken in less than a second 
during the reconfiguration of a nanobeam after bringing it slowly 
down to 65.0°C in M2 + R coexistence. A small pre-existing wedge 
of M1 at the M2 + R interface rapidly expands to replace the R part of 
the nanobeam completely. All the above observations thus suggest that 
the triple point is between 64.9 and 65.2 °C. 

We now consider varying L at fixed T. First, in coexistence between
any pair of phases the interface position is linear in L, as shown in
Fig. 2e. This follows from the fact that the interface moves so as to
maintain P_c at the phase equilibrium value. A length increase δL
causes an interface shift δy_M1R, which changes the natural length by
δL to keep the strain constant. This implies that δL = α_M1R δy_M1R,
where α_M1R = c_M1/c_R − 1. Hence y_M1R should vary according to
dL/dy_M1R = α_M1R, and similarly dL/dy_M2M1 = α_M2M1 = c_M2/c_M1 − 1 and
dL/dy_M2R = α_M2R = c_M2/c_R − 1 ≈ α_M2M1 + α_M1R. Best linear fits to the
data shown give α_M2M1 = 0.0074, α_M1R = 0.0100 and α_M2R = 0.0174,
close to the values of 0.0073, 0.0098 and 0.0172 calculated from the
known lattice constants.
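
The comparison with the lattice constants can be reproduced with a short calculation. The sketch below is an illustration rather than the authors' analysis: it assumes the rounded c-axis values quoted earlier in the text (5.700, 5.755 and 5.797 Å), so α_M1R and α_M2R come out slightly below the quoted 0.0098 and 0.0172, which presumably rest on unrounded constants. All function and variable names are hypothetical.

```python
# Rounded pseudo-rutile c-axis lattice constants quoted in the text (angstroms).
# Assumption: these rounded values stand in for the "known lattice constants".
C_AXIS = {"R": 5.700, "M1": 5.755, "M2": 5.797}


def alpha(i: str, j: str) -> float:
    """Fractional c-axis length difference alpha_ij = c_i / c_j - 1."""
    return C_AXIS[i] / C_AXIS[j] - 1.0


# Values implied by the lattice constants.
alpha_lattice = {
    "M2M1": alpha("M2", "M1"),
    "M1R": alpha("M1", "R"),
    "M2R": alpha("M2", "R"),
}

# Inverse slopes dL/dy from the best linear fits reported in the text.
alpha_fitted = {"M2M1": 0.0074, "M1R": 0.0100, "M2R": 0.0174}

for name in alpha_lattice:
    print(f"alpha_{name}: lattice {alpha_lattice[name]:.4f}, "
          f"fitted {alpha_fitted[name]:.4f}")

# The additive relation is only approximate: exactly,
# 1 + alpha_M2R = (1 + alpha_M2M1) * (1 + alpha_M1R),
# so the residual equals the product of the two small terms.
residual = alpha_lattice["M2R"] - (alpha_lattice["M2M1"] + alpha_lattice["M1R"])
print(f"residual of additive approximation: {residual:.1e}")
```

The residual is of second order in the mismatches, which is why the sum α_M2M1 + α_M1R agrees with α_M2R to the precision quoted.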

Figure 3 | Resistance-length measurements. a, At 65.3 °C, above the triple
point (red line in inset), the resistance varies steadily in M2 + R coexistence. At
63.9 °C, below the triple point (blue line in inset), it reflects a sequence of
transitions (device P10; L is varied at 8 nm s⁻¹). b, Starting in the R state, M2
nucleates if T = 65.10 °C (red), whereas M1 nucleates if T = 64.95 °C (blue),
implying that these lie on either side of T_tr (see inset). c, The variation of the
resistance with L and T is due to a strain-dependent activation energy in the M1
phase (dotted lines, offset by −0.15 MΩ for clarity) and to conversion of M1
to M2 in coexistence (dashed line). Grey lines indicate an additional resistance
rise attributed to the T phase.

The ability to control L allows us to confirm the temperature of
the triple point and to determine the behaviour very close to it. We
exploit the fact that the electrical resistance of the nanobeam is
sensitive to the phase composition because each phase has a different
resistivity. The measurements in Fig. 3 are for a device (P10) with
indium contacts. Figure 3a shows that at 65.3 °C the resistance changes smoothly
with L, as a result of a smoothly changing M2 + R interface position for
T > T_tr (see inset, red line). In contrast, at 63.9 °C it changes in a more
complicated way, reflecting the sequence M1 + R → M1 → M2 + M1
expected for T < T_tr (see Fig. 3a inset, blue line). Jumps and hysteresis
here show that M1 and M2 both require nucleation, which is consistent
with the transitions being first order. To establish T_tr we measured the resistance
at a series of closely spaced temperatures, each time preparing the
nanobeam in a fully metallic R state by cooling at sufficiently small
L for R to be stabilized by compression, and then increasing L until an
insulating domain nucleated. At 64.95 °C and below, the domain that
appeared was always M1, whereas at 65.10 °C and above it was always
M2, implying that T_tr was between these two values (see Fig. 3b). This
is perfectly consistent with the range of T_tr deduced above from the
T-sweeping measurements. Including uncertainties from variation
between samples, temperature fluctuations and calibration, we conclude
that T_tr = 65.0 ± 0.1 °C.

[Figure 4 image labels: As measured; Broken; Compressed at 65.05 °C; Retracted; Cooled to 64.95 °C. Axis label: Tensile stress P_c (GPa).]

Measurements of resistance versus length also yield other useful
information, as illustrated in Fig. 3c (see Supplementary Information
for details). First, the variation of the resistance of the M1 state with L
and T is explained by a linear increase in the activation energy of the resistance
with tensile strain η = (L − L_0)/L_0, L_0 being the effective natural
length. The dotted lines are plots of resistance ∝ exp[−(Δ_0 + γη)/k_B T] using
coefficient values Δ_0 = 0.31 eV and γ = 0.77 eV (the uncertainty in γ is
10%), where k_B is Boltzmann’s constant. Second, from the variation of
the resistance in M1 + M2 coexistence (such as that indicated by the dashed line)
we can deduce that ρ_M2/ρ_M1 = 2.3 ± 0.2 and that the activation energies
of M1 and M2 are the same to within a few per cent. Third, a
distinct additional increase in resistance, indicated by the solid grey lines,
precedes the nucleation of M2 from M1. This can be explained by a
continuous distortion of M1 into the T phase, which we immediately
infer has a higher resistivity than M1 and is unstable relative to M2 at
all temperatures from T_tr to below 26 °C.

Although we cannot measure the axial stress P_c directly, we can
realize the condition P_c = 0 simply by breaking a nanobeam with a
micromanipulator after other measurements have been completed.
This produces opposing cantilevers, as illustrated in Fig. 4a. If the
cantilevers are prepared in the fully M1 state by warming from lower
temperature to around T_c and are then brought together, the compression
produces a domain of R phase in one of them. After retraction,
this domain persists only above a certain temperature, and
shrinks and disappears below it. We identify this temperature with
T_c, the transition temperature at zero stress. By performing the procedure
on several devices we obtained the striking result that in every
nanobeam T_c was equal to T_tr to within an uncertainty of δT ~ 0.05 °C
governed by temperature fluctuations. We thus conclude that
T_c = T_tr = 65.0 ± 0.1 °C.

Figure 4 | Phase diagram of VO2. a, The transition temperature T_c at zero
stress is measured by finding the temperature above which the metallic phase
(darker) becomes stable in a cantilever, as illustrated here (device P8). It is
found to be equal to the triple point temperature: T_c = T_tr = 65.0 ± 0.1 °C.
b, Deduced stress–temperature phase diagram. The small black filled circles are
for the superheated M1 phase. The grey shaded strip is where a metastable T
phase can occur. c, The results imply that the free energies of all the phases are
degenerate at T_c in unstrained pure VO2. [Axis label: Temperature (°C).]

Figure 4b shows the phase diagram of VO2 inferred from measurements
on ten nanobeams (see Supplementary Information for details).
In brief, the P_c(T)|_ij were deduced from measurements of y_ij (i, j =
M1, M2, R) versus T as follows. Because the stress in coexistence must
take the phase equilibrium value, consideration of the variation of the
strain η = P_c/E with T (E is Young’s modulus, taken to be 140 GPa for
every phase) yields
1 OP, 
E 0T 


dy 
~ dT 


AK (1) 


ij ij 
The first term on the right is the change due to movement of the
interface. The second, ΔK, is the thermal expansion mismatch between
nanobeam and silicon substrate, which produces a correction of
5–10%. Given that η(T_tr) = 0, equation (1) can be used to derive
η(T) for each boundary. The deduced phase boundaries are straight,
with uncertainties in their slopes of 5–10%, and obey the constraint at
the triple point

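$$\alpha_{\mathrm{M2M1}}\left.\frac{dy}{dT}\right|_{\mathrm{M2M1}} + \alpha_{\mathrm{M1R}}\left.\frac{dy}{dT}\right|_{\mathrm{M1R}} = \alpha_{\mathrm{M2R}}\left.\frac{dy}{dT}\right|_{\mathrm{M2R}} \tag{2}$$

which is imposed by the Clausius-Clapeyron relations

$$\left.\frac{\partial P_c}{\partial T}\right|_{ij} = \frac{S_j - S_i}{b^2(a_i - a_j)} = \frac{S_j - S_i}{\alpha_{ij}V} \tag{3}$$
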

in combination with equation (1). Here S_i is the entropy per vanadium
pair in phase i, b = 4.55 Å is the base length of the rutile unit cell,
and V = 59 Å³ is the rutile unit cell volume. The value of
∂P_c/∂T|_M1R = 71 MPa °C⁻¹ corresponds to the known latent heat of
1,020 cal per mole formula unit; ∂P_c/∂T|_M2R = 29 MPa °C⁻¹ corresponds
to 710 cal mol⁻¹; and ∂P_c/∂T|_M2M1 = −29 MPa °C⁻¹. From
the results we deduce entropy differences S_R − S_M1 = (3.0 ± 0.3)k_B
and S_R − S_M2 = (2.1 ± 0.1)k_B. The equality of T_tr and T_c to within
δT ~ 0.05 °C implies that the strain η_tr at the triple point is smaller
than δT dy/dT|_M2R = 1.0 × 10⁻⁵, where dy/dT|_M2R = 2.0 × 10⁻⁴ °C⁻¹,
and this is also indicated on the phase diagram. Finally, the finding that
the T phase is metastable with respect to M2 is indicated by a grey
shaded strip within the M2 stability region.
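
The quoted slopes, entropy differences and latent heats can be cross-checked against the Clausius-Clapeyron relation. The sketch below runs the relation in the reverse direction to the text (from the quoted entropy differences back to the slopes), purely as a consistency check and not as the authors' procedure. It assumes the α_ij values calculated from the lattice constants, V = 59 Å³, a thermochemical calorie of 4.184 J, and that one vanadium pair corresponds to two VO2 formula units; the variable names are hypothetical.

```python
K_B = 1.380649e-23       # Boltzmann constant, J/K
N_A = 6.02214076e23      # Avogadro constant, 1/mol
J_PER_CAL = 4.184        # thermochemical calorie (assumed convention)
V_CELL = 59e-30          # rutile unit cell volume from the text, m^3
T_TR = 65.0 + 273.15     # triple-point temperature from the text, K

# Fractional lattice mismatches alpha_ij quoted in the text.
alpha = {"M1R": 0.0098, "M2R": 0.0172, "M2M1": 0.0073}

# Entropy differences S_j - S_i per vanadium pair, in units of k_B.
# S_M1 - S_M2 follows from the two quoted differences: 2.1 - 3.0 = -0.9.
delta_s = {"M1R": 3.0, "M2R": 2.1, "M2M1": 2.1 - 3.0}

# Clausius-Clapeyron slope dP_c/dT|_ij = (S_j - S_i) / (alpha_ij * V), in MPa per K.
slope_mpa = {
    ij: delta_s[ij] * K_B / (alpha[ij] * V_CELL) / 1e6 for ij in alpha
}

# Latent heat T * dS per mole of VO2 formula units, in cal/mol.
# The factor 1/2 converts "per vanadium pair" to "per formula unit".
latent_cal = {
    ij: T_TR * delta_s[ij] * K_B * N_A / 2 / J_PER_CAL for ij in ("M1R", "M2R")
}

for ij, value in slope_mpa.items():
    print(f"dP_c/dT|_{ij} = {value:6.1f} MPa/K")
for ij, value in latent_cal.items():
    print(f"latent heat ({ij}) = {value:6.0f} cal/mol")
```

With these inputs the slopes come out near 72, 29 and −29 MPa °C⁻¹ and the latent heats near 1,010 and 710 cal mol⁻¹, within the stated uncertainties of the values in the text.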

To stress the implication of these results we sketch in Fig. 4c the T
dependence of the Gibbs free energies G_i of the phases of unstrained
VO2, setting G_R = 0. The slopes are the entropies S_i = −dG_i/dT at zero
stress. Precisely at the MIT the insulating M1 and M2 phases are
simultaneously degenerate with the metallic R phase. This and other
facts revealed by our measurements are not explained by current models
of the transition, but will be crucial ingredients of the correct theory.
For example, further development and application of the Landau
theory of VO2 should be prompted by our results. The insights we
have gained into this important solid-state phase transition will be
critical for both understanding and mastering the MIT in VO2.


METHODS SUMMARY 


VO2 nanobeams grown by physical vapour transport were transferred onto slots
on the micromachined silicon chips by using a micromanipulator and bonded
with ultraviolet-curable epoxy (see Supplementary Information). Measurements
from ten devices were used, and the temperature was calibrated with the known
melting points of gallium and potassium (Supplementary Information). The slot
width L (20 or 40 μm) was varied piezoelectrically on a temperature stage under an
optical microscope (Supplementary Information).


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 

The role of spin in the kinetic control of
recombination in organic photovoltaics

David S. Ginger & Richard H. Friend


In biological complexes, cascade structures promote the spatial
separation of photogenerated electrons and holes, preventing their
recombination. In contrast, the photogenerated excitons in organic
photovoltaic cells are dissociated at a single donor-acceptor heterojunction
formed within a de-mixed blend of the donor and acceptor
semiconductors. The nanoscale morphology and high charge densities
give a high rate of electron-hole encounters, which should in
principle result in the formation of spin-triplet excitons, as in organic
light-emitting diodes. Although organic photovoltaic cells would
have poor quantum efficiencies if every encounter led to recombination,
state-of-the-art examples nevertheless demonstrate near-unity
quantum efficiency. Here we show that this suppression of recombination
arises through the interplay between spin, energetics and
delocalization of electronic excitations in organic semiconductors.
We use time-resolved spectroscopy to study a series of model
high-efficiency polymer-fullerene systems in which the lowest-energy
molecular triplet exciton (T_1) for the polymer is lower in energy than
the intermolecular charge transfer state. We observe the formation
of T_1 states following bimolecular recombination, indicating that
encounters of spin-uncorrelated electrons and holes generate charge
transfer states with both spin-singlet (¹CT) and spin-triplet (³CT)
characters. We show that the formation of triplet excitons can be
the main loss mechanism in organic photovoltaic cells. But we also
find that, even when energetically favoured, the relaxation of ³CT
states to T_1 states can be strongly suppressed by wavefunction delocalization,
allowing for the dissociation of ³CT states back to free charges,
thereby reducing recombination and enhancing device performance.
Our results point towards new design rules both for photoconversion
systems, enabling the suppression of electron-hole
recombination, and for organic light-emitting diodes, avoiding the
formation of triplet excitons and enhancing fluorescence efficiency.

The key photophysical processes in an organic photovoltaic cell
(OPV) are illustrated in Fig. 1a. In the first step, photogenerated excitons
are dissociated by charge transfer across the donor-acceptor interface,
leading to either long-range charge separation or the formation of
bound interfacial charge transfer states (CTSs). Such bound charge
pairs then decay to the ground state by means of geminate recombination.
Spin must be taken into account when considering CTSs because
they can have either singlet (¹CT) or triplet (³CT) spin character, which
are almost degenerate in energy. Dissociation of photogenerated singlet
excitons leads to the formation of only ¹CT states, owing to spin
conservation. In contrast, recombination of spin-uncorrelated charges,
that is, bimolecular recombination, should lead to the formation of ¹CT
and ³CT states in a 1:3 ratio, according to spin statistics. Spin-singlet
states can recombine to the ground state through either luminescence
(which is slow for this intermolecular donor-acceptor process) or
non-radiative decay. For ³CT states, decay to the ground state is
spin-forbidden and, hence, both radiative and non-radiative processes are
very slow. However, if the energy of the T_1 state is less than the ³CT
energy (as is required to maximize open circuit voltage, V_OC; refs 7, 8),
then ³CT can relax to T_1.

The most efficient OPV systems comprise nanoscale (<5 nm)
domains of pure fullerene acceptor and domains of fullerene intimately
mixed with a polymer donor. These length scales are smaller than the
Coulomb capture radius, $r_\mathrm{c}$, in organic semiconductors
($k_\mathrm{B}T = e^2/4\pi\varepsilon_0\varepsilon r_\mathrm{c}$, where
$k_\mathrm{B}$ is Boltzmann's constant, $T$ is the temperature, $e$ is
the electron charge, and $\varepsilon_0$ and $\varepsilon$ are
respectively the vacuum and relative permittivities), which is ~16 nm
at room temperature owing to the low dielectric constant of these
materials ($\varepsilon \approx 3$-4). This leads to a high rate of
electron-hole encounters that could produce Coulombically bound CTSs.
This model for recombination and the importance of spin statistics are
well established in organic light-emitting diodes, where the formation
of (non-luminescent) triplet excitons through bimolecular
recombination is a major loss mechanism. Efforts to overcome this
problem have focused on the use of metal-organic complexes to induce
spin-orbit coupling and, more recently, on the use of
low-exchange-energy materials that can promote intersystem crossing
from T₁ to S₁ (ref. 13).
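
As a quick numerical check of the capture radius quoted above, the defining relation can be solved for the radius, giving e²/(4πε₀ε k_B T). The sketch below evaluates this with CODATA constants; the function name and the choice of 300 K for room temperature are assumptions made for illustration.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
K_B = 1.380649e-23           # Boltzmann constant, J/K
EPS_0 = 8.8541878128e-12     # vacuum permittivity, F/m


def coulomb_capture_radius_nm(rel_permittivity, temperature_k=300.0):
    """Distance (nm) at which the electron-hole Coulomb energy equals k_B*T."""
    radius_m = E_CHARGE ** 2 / (
        4.0 * math.pi * EPS_0 * rel_permittivity * K_B * temperature_k
    )
    return radius_m * 1e9


if __name__ == "__main__":
    for eps in (3.0, 3.5, 4.0):
        print(f"eps = {eps}: r_c = {coulomb_capture_radius_nm(eps):.1f} nm")
```

For relative permittivities between 3 and 4 the result lies between roughly 14 and 19 nm, consistent with the ~16 nm stated in the text and well above the <5 nm domain size.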

In contrast, for OPVs electron-hole encounters have been thought 
of as terminal recombination events, which, as noted above, is at
odds with the high external quantum efficiencies (EQEs) demon-
strated in empirically optimized systems. Moreover, the roles of spin
and the nature of the intermediate bound CTSs formed after electron- 
hole capture have not been explored. Here we demonstrate that the 
recombination of these bound states is mediated not only by ener- 
getics, but also by spin and delocalization, allowing for free charges to 
be reformed from these bound states and thus greatly suppressing 
recombination. 

Figure 1b shows the structures of the two polymers and three fullerene
derivatives used as electron donors and, respectively, acceptors in
this study. PC60BM (phenyl-C60-butyric acid methyl ester), PC70BM
(phenyl-C70-butyric acid methyl ester) and IC60MA (indene-C60
monoadduct, referred to as ICMA) are mono-substituted fullerene
derivatives. The lowest unoccupied molecular orbital of ICMA is
raised by less than 0.1 eV in comparison with PC60BM, whereas in
IC60BA (indene-C60 bisadduct, referred to as ICBA), a bis-substituted
derivative, the energy of the lowest unoccupied molecular orbital is
raised by about 0.2 eV (refs 17, 18). The donor copolymer PIDT-PhanQ
(poly(indacenodithiophene-co-phenanthro[9,10-b]quinoxaline)) was
chosen for this study because in it the spectral signatures of charges
(hole polarons) and triplets are considerably different. As we explain
below, this allows us to resolve temporally the interconversion between
charges and triplets. It has been recently demonstrated that 1:3 blends of
PIDT-PhanQ with PC60BM give excellent photovoltaic performance
with internal quantum efficiencies >80% and power conversion
efficiencies >4% (ref. 18). In contrast, blends with either ICMA or ICBA
give lower performance with power conversion efficiencies of 2.9%.
PCPDTBT (poly[2,6-(4,4-bis-(2-ethylhexyl)-4H-cyclopenta[2,1-b;3,4-b']-dithiophene)-alt-4,7-(2,1,3-benzothiadiazole)])
is a widely studied low-bandgap polymer. Despite extensive research,
the performance of PCPDTBT:PC70BM blends remains modest, with
EQE ~ 50% (ref. 20). Blends with ICMA or ICBA have even lower
performance. Absorption spectra, EQE and current density (J)-voltage (V)
curves are provided in Supplementary Information.

Figure 1 | Photophysical process in an OPV and molecular structures.
a, State diagram representing the various photophysical processes in an OPV.
Conversions between excited state species are shown in blue and recombination
channels are shown in red. S₁ and T₁ are the lowest-energy singlet and triplet
excitons, respectively. Here we define the CTS energy as the energy of the
relaxed, Coulombically bound electron-hole pair across the heterojunction.
Process 1: photoexcitation creates a singlet exciton. Process 2: the singlet
exciton is ionized at a heterojunction, leading to the formation of ¹CT states that
separate into free charges (FC) with high efficiency. Process 3: bimolecular
recombination of electrons and holes leads to the formation of ¹CT and ³CT
states in a 1:3 ratio, as mandated by spin statistics. The ¹CT state can recombine
to the ground state (process 6). Process 4: for the ³CT state, recombination to
the ground state is spin-forbidden, but relaxation to the T₁ state is energetically
favourable. Process 5: once formed, triplet excitons can return to the ground
state through an efficient triplet-charge annihilation channel. Under
favourable conditions, as explained in the text, the time required for CTSs to
reionize to free charges, τ₃ (process 3), is less than the time required for
relaxation to T₁, τ₄ (process 4). Thus, CTSs are recycled back to free charges,
leading to a suppression of recombination. b, Molecular structures of the
donors and acceptors used in this study. Me, methyl.

For all the studied blends, the energy of the CTS is greater than that
of T₁. For PIDT-PhanQ blends, the energies of the CTSs have been
previously established using their weak photoluminescence and were
found to be 1.31, 1.36 and 1.44 eV for PIDT-PhanQ:PC60BM,
PIDT-PhanQ:ICMA and PIDT-PhanQ:ICBA, respectively. The energies of
T₁ in PIDT-PhanQ, PCPDTBT and the fullerene derivatives have been
established to be 1 (ref. 18), 1 (ref. 21) and 1.5 eV (ref. 22), respectively.
The CTS energy of PCPDTBT:PC70BM blends has previously been
measured to be 1.2 eV (ref. 23). The CTS energies of PCPDTBT:ICMA and
PCPDTBT:ICBA are thus greater than this. Therefore, the molecular
triplet exciton of the donor polymer is the lowest-energy excited state
for all the studied blends. We note that this is the standard configuration
in the current generation of donor-acceptor systems, driven by the need
to maximize Voc, which mandates that the charge transfer level lie close
to S₁ (refs 7, 8).

Here we investigate thin films of these blends using high-sensitivity 
transient absorption spectroscopy with a broad spectral and temporal 
range (Methods). Figure 2a shows the transient absorption spectra of a 
1:3 PIDT-PhanQ:PC60BM blend. A broad photoinduced absorption
(PIA) feature is formed between the wavelengths 1,100 and 1,500 nm 
within the instrument response time (2 ns) and decays over several 
hundred nanoseconds without spectral evolution. The long lifetime of 
the signal and the fact that it is not observed in pristine films of PIDT-
PhanQ (Supplementary Information) rule out PIA by singlet exci-
tons. In contrast, efficient photogeneration of charge is expected in this
blend and, thus, the PIA is assigned to charges (hole polarons) on the 
polymer. Figure 2b shows equivalent spectra for a PIDT-PhanQ:ICMA 
blend. Here, at the earliest times, the shape of the PIA is similar to that 
for PIDT-PhanQ:PC60BM (Fig. 2a), but at later times we observe
spectral evolution. Between 1,100 and 1,170 nm, the signal decays with 
time. However, between 1,300 and 1,500 nm the PIA increases for the 
first 50 ns. The spectrum is also seen to broaden and redshift. This 
spectral evolution is even more pronounced in the PIDT-PhanQ:ICBA 
blend shown in Fig. 2c. Thus, unlike the PIDT-PhanQ:PC60BM spec-
trum, which shows no spectral evolution and is consistent with the
decay of a single excited state, the PIDT-PhanQ:ICMA and PIDT- 
PhanQ:ICBA spectra suggest that a second excited state with a PIA 
overlapping the PIA of charges is being formed on timescales of tens to 
hundreds of nanoseconds. 

Figure 2e compares the normalized kinetics of PIDT-PhanQ:PC60BM
and PIDT-PhanQ:ICBA blends. The PIDT-PhanQ:PC60BM blend (cir-
cles) shows no difference between the kinetics of the 1,100-1,200-nm
and 1,400-1,500-nm regions, supporting the presence of only one 
excited-state species. In contrast, for the PIDT-PhanQ:ICBA blend 
(squares), a large difference in the kinetics of the two regions is observed. 
The rise time of the low-energy region is much longer than for the 
higher-energy region, indicating the growth of a second long-lived 
excited state species on nanosecond timescales. 

Figure 2f compares the normalized kinetics of the 1,400-1,500-nm 
region in PIDT-PhanQ:ICBA for different values of pulse fluence. A 
clear dependence on pulse fluence is observed, with rise times (to the 
signal maximum) as large as 80 ns. Similar fluence dependence for the 
rise time is not observed for the 1,100-1,200-nm region (Supplemen- 
tary Information), with the signal maximum occurring within the rise 
time of the instrument. The rise time of the 1,400-1,500-nm region is 
also fluence dependent in PIDT-PhanQ:ICMA but not in PIDT-
PhanQ:PC60BM (Supplementary Information). This fluence depend-
ence in ICBA and ICMA blends indicates that the second excited-state
species growing in is formed by bimolecular processes. 

The overlapping spectra of the excited states make the analysis of 
their kinetics difficult. To overcome this problem, we use a genetic 
algorithm that allows us to extract the individual spectra and kinetics
from the data set (Methods). Figure 2d shows the two spectra (solid 
lines) that the algorithm extracts from the PIDT-PhanQ:ICBA spec- 
trum in Fig. 2c. The spectrum in blue is the charge (hole polaron) and 
the one in red is the triplet exciton on PIDT-PhanQ. These assign-
ments are based on previous continuous-wave PIA experiments as
well as early-time transient absorption measurements (Supplementary 
Information). 

From the spectra and kinetics presented in Fig. 2, we can now 
observe that charges are formed within the instrument response time 
(2 ns) in all blends. For the 1:3 PIDT-PhanQ:PC60BM blend presented
in Fig. 2a, charges then decay on a 1-μs timescale and no triplet forma-
tion is observed. For PIDT-PhanQ:ICMA and PIDT-PhanQ:ICBA,
triplet excitons are formed through bimolecular recombination on 
nanosecond timescales before decaying. 

Figure 3a, b shows the kinetics, at various fluences, extracted from
the genetic-algorithm-based global analysis for PIDT-PhanQ:ICMA
and PIDT-PhanQ:ICBA. The extracted kinetics clearly demonstrate
that triplets grow in as charges decay. We consider that the primary
decay channel for triplets is triplet-charge annihilation, owing to the
high charge densities present, and model the time evolution of the
system (Fig. 3a, b, solid lines) using the equation

$$\frac{\mathrm{d}N_\mathrm{T}}{\mathrm{d}t} = -\alpha\,\frac{\mathrm{d}p}{\mathrm{d}t} - \beta N_\mathrm{T}\,p \qquad (1)$$

where p is the charge concentration, N_T is the triplet concentration, α is
the fraction of decaying charges that form triplets and β is the rate
constant for triplet-charge annihilation.

Figure 2 | Excited-state spectra and kinetics for

species. f, Fluence dependence of the low-energy
region (1,400-1,500 nm) for PIDT-PhanQ:ICBA.
The fluence-dependent growth of the feature
demonstrates that the second excited-state species,
triplets, are formed through bimolecular processes.

Values of β vary by a factor of two with fluence, and at a fluence
of 2 μJ cm⁻² for the PIDT-PhanQ:ICBA blend we obtain a value of
0.58 for α and a value of 2.2 × 10⁻¹⁰ cm³ s⁻¹ for β (Supplementary
Information). This demonstrates that a large fraction of charges
undergo bimolecular recombination, mediated by ³CT, to form triplet
excitons. Once formed, triplets are quickly quenched as a result of
triplet-charge annihilation, as indicated by the high value of β. This
is important: given sufficient time, triplets could be re-ionized through
thermal excitation to CTSs. However, the presence of a strong triplet-
charge annihilation channel means that recombination to triplets is a
terminal loss and makes it a major loss pathway in OPVs.
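
Equation (1) can be integrated numerically once a decay law for the charges is chosen. The sketch below is illustrative only and is not the fitting procedure used for Fig. 3: it assumes purely bimolecular charge decay, dp/dt = -γp², which the text does not state, and the rate constant γ, the initial charge densities and the function names are hypothetical. The defaults for α and β follow the values quoted above for the PIDT-PhanQ:ICBA blend.

```python
def simulate_triplets(p0_cm3, alpha=0.58, beta_cm3_s=2.2e-10,
                      gamma_cm3_s=1e-10, t_end_s=2e-6, dt_s=1e-10):
    """Explicit-Euler integration of equation (1).

    Assumption (not from the source): charges decay as dp/dt = -gamma * p**2.
    Returns lists of times (s), charge and triplet densities (cm^-3).
    """
    times, charges, triplets = [0.0], [p0_cm3], [0.0]
    p, n_t = p0_cm3, 0.0
    steps = int(round(t_end_s / dt_s))
    for i in range(1, steps + 1):
        dp_dt = -gamma_cm3_s * p * p                      # assumed charge decay
        dn_dt = -alpha * dp_dt - beta_cm3_s * n_t * p     # equation (1)
        p += dp_dt * dt_s
        n_t += dn_dt * dt_s
        times.append(i * dt_s)
        charges.append(p)
        triplets.append(n_t)
    return times, charges, triplets


def peak_time(times, triplets):
    """Time at which the triplet population reaches its maximum."""
    index = max(range(len(triplets)), key=triplets.__getitem__)
    return times[index]


if __name__ == "__main__":
    for p0 in (1e17, 5e16):  # hypothetical initial charge densities, cm^-3
        t, _, n = simulate_triplets(p0)
        print(f"p0 = {p0:.0e} cm^-3: triplet peak at {peak_time(t, n) * 1e9:.0f} ns")
```

Under these assumptions the triplet population rises and then decays, and its maximum shifts to later times as the initial charge density is lowered, which is the qualitative fluence dependence described for the triplet growth.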

We now turn to the question of whether the time taken for relaxation
from ³CT to T₁ (process 4 in Fig. 1a, with an associated timescale
τ₄) is fast and, if not, whether there are competing processes for the
decay of ³CT. As noted earlier, for all PIDT-PhanQ:fullerene blends
the charge transfer energy is greater than the T₁ energy, making
relaxation from ³CT to T₁ energetically favoured. However, for the more
efficient 1:3 PIDT-PhanQ:PC60BM blend, no triplet formation is
observed at room temperature (Fig. 2a). But at low temperatures
(<240 K), bimolecular triplet formation is observed in this blend
(Fig. 3c; temperature-dependent kinetics of the raw data are provided
in Supplementary Information). The solid lines are fits using the model
described in equation (1). This result suggests that there is a thermally
activated process that competes with relaxation to T₁. We consider this
process to be the dissociation of ³CT back to free charges. This is based
on the fact that no other excited-state species are observed for this
system (Fig. 2). Thus, at high temperatures (>240 K) the dissociation
of ³CT back to free charges (process 3 in Fig. 1a, with an associated
timescale τ₃) out-competes relaxation of ³CT to T₁; that is, τ₄ > τ₃. At
lower temperatures, this dissociation process is suppressed, such that
τ₄ < τ₃, leading to a build-up of triplet excitons (Fig. 3c).
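
The competition between τ₃ and τ₄ can be illustrated with a simple branching-ratio estimate. The sketch below is not taken from the source: it assumes an Arrhenius form for the thermally activated dissociation rate 1/τ₃ and a temperature-independent relaxation rate 1/τ₄, and the prefactor, barrier and relaxation rate are hypothetical values chosen only to show the trend, not fitted to the data.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K


def dissociation_rate(temperature_k, prefactor_s=1e11, barrier_ev=0.10):
    """Assumed Arrhenius rate 1/tau_3 (s^-1) for 3CT -> free charges."""
    return prefactor_s * math.exp(-barrier_ev / (K_B_EV * temperature_k))


def triplet_branching(temperature_k, relaxation_rate_s=1e9,
                      prefactor_s=1e11, barrier_ev=0.10):
    """Fraction of 3CT states that relax to T1 instead of re-dissociating."""
    k3 = dissociation_rate(temperature_k, prefactor_s, barrier_ev)
    return relaxation_rate_s / (relaxation_rate_s + k3)


def crossover_temperature(relaxation_rate_s=1e9, prefactor_s=1e11,
                          barrier_ev=0.10):
    """Temperature (K) at which tau_3 equals tau_4 in this toy model."""
    return barrier_ev / (K_B_EV * math.log(prefactor_s / relaxation_rate_s))
```

In this toy model the fraction of ³CT states that end up as triplets grows on cooling, because the activated dissociation channel slows while relaxation does not; the temperature at which the two rates cross depends entirely on the assumed parameters.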

The above result raises the question of why triplet formation is
observed in ICBA and ICMA blends at room temperature but is
out-competed by dissociation back to free charges in the 1:3
PIDT-PhanQ:PC60BM blend. As noted above, the charge transfer levels of
the ICMA and PC60BM blends are within 50 meV of each other and,
hence, a simple energetics argument is unlikely to explain this
difference. Our previous work on CTSs formed at early times through the
ionization of excitons at heterojunctions suggested that their
dissociation was mediated by charge wavefunction delocalization. We
propose that the same mechanism is applicable to CTSs formed through
bimolecular recombination. It is known that PCBM forms large
aggregates efficiently, in contrast to other fullerenes, and that
aggregation aids charge separation. This effect is most probably due to
delocalization of the electron wavefunction over the PCBM aggregates;
fullerenes forming smaller aggregates would lead to more localized
electron wavefunctions. This would imply that CTSs formed through
recombination were more loosely bound in PIDT-PhanQ:PC60BM (1:3
blends) than in PIDT-PhanQ:ICMA and PIDT-PhanQ:ICBA and were thus more
susceptible to dissociation back to free charges. To test this
hypothesis, we study the recombination dynamics in a 1:1
PIDT-PhanQ:PC60BM blend spun from chloroform. The lower fullerene
concentration and low-boiling-point solvent lead to a more intimate
blend and arrest the growth of large fullerene aggregates. This is
confirmed by grazing-incidence small-angle X-ray scattering
measurements (Supplementary Information), which also show formation of
smaller aggregates in the ICMA and ICBA blends than in the 1:3
PIDT-PhanQ:PC60BM blend. Bimolecular triplet formation is observed in
the 1:1 PIDT-PhanQ:PC60BM blend (Fig. 3d), which shows the normalized
fluence dependence of the triplets (raw data and charge dynamics are
shown in Supplementary Information). Thus, by disrupting fullerene
aggregation and, hence, charge delocalization, we make τ₄ < τ₃. This
result confirms that delocalization has a crucial role in
recombination.

To generalize the above results, we now study PCPDTBT blends. 
Figure 4a shows the evolution of the transient absorption spectrum of a
1:2 PCPDTBT:ICBA blend. A broad PIA feature between 1,175 and 
1,550 nm is formed within the instrument response time (2 ns), and its 
peak blueshifts from 1,300 to 1,275 nm over tens of nanoseconds. A 
similar blueshift was observed in films of PCPDTBT:PC70BM and
PCPDTBT:ICMA (Supplementary Information). The triplet spectrum 
extracted from a genetic algorithm analysis of the blends (Fig. 4b, solid 
red line) shows excellent agreement with the measured triplet spectrum 
in neat PCPDTBT (Fig. 4b, dashed red line; see Methods). The triplet 
peak at higher energy, with respect to the charge, explains the blueshift- 
ing of the transient absorption spectrum in Fig. 4a as triplets grow in. 

Figure 4c shows the fluence dependence of the charge and triplet for 
a PCPDTBT:PC70BM film, similar to those shown in Fig. 3a, b. The
solid lines are fits to the experimental data obtained using equation (1),
and support the general applicability of the presented model. The result
also explains why PCPDTBT blends have only modest EQEs, with
recombination to triplets being a major loss mechanism even for the
PC70BM blend.

On the basis of these results, we can now propose a new photophysical
model of recombination in OPVs, summarized in Fig. 1a. Within
working devices, the high charge densities present (10¹⁵-10¹⁷ cm⁻³)
lead to bimolecular electron-hole capture events forming CTSs with
both spin-singlet and spin-triplet character, ¹CT and ³CT.
Recombination from ³CT back to the ground state is spin-forbidden, but
in most systems T₁ is lower in energy than CT, such that energetic
relaxation to bound triplet excitons is favourable (with an associated
time τ₄). The high charge densities then result in rapid quenching of
these triplets. Thus, for most blends two recombination channels exist,
one by means of triplets and the other through the radiative and
non-radiative relaxation of the ¹CT states. However, as demonstrated in
the PIDT-PhanQ:PC60BM system, when the acceptor is well ordered
(encouraging wavefunction delocalization) the reionization of CTSs back
to free charges (with an associated time τ₃) can occur faster than
relaxation to T₁ (τ₄ > τ₃). In this case, the triplet recombination
pathway is turned off, leaving only recombination through the singlet
channel.


[Figure 4 image. Panels a-c; panel c title: PCPDTBT:PCBM. Legend entries: time slices 1-2 ns, 2-5 ns, 30-50 ns, 100-500 ns; Polaron; Triplet (GA extracted); Triplet (measured); Charges; Triplets; Fluence. Horizontal axis: Wavelength (nm).]


Figure 4 | Triplet and charge kinetics for PCPDTBT blends. a, Temporal 
evolution of the transient absorption spectrum for PCPDTBT:ICBA, excited 
with an excitation fluence of 2 μJ cm⁻². Temporal slices are averaged over the
indicated time periods and smoothed. A blueshift of the spectra from a peak at 
1,300 nm to 1,275 nm can be seen over the first 100 ns. b, Triplet spectrum 
extracted from the genetic algorithm (GA) analysis (solid red line) and that 
measured by doping a PCPDTBT thin film with a triplet sensitizer (dashed red 
line). The blue line shows the charge spectrum as measured 50 ps after
photoexcitation, which is sufficient time for charge generation but not enough
for triplet formation to begin. c, Charge and triplet dynamics (circles) for
PCPDTBT:PC70BM, extracted from the global genetic algorithm analysis,
analogous to those shown in Fig. 3a, b. Charges are formed within the 
instrument response time in all cases. The growth of triplets is fluence 
dependent, with a maximum population attained at later times for lower 
fluences. The solid lines are fits of the experimental data using the model 
described in the text.

For working OPV devices, bimolecular recombination controls the
shape of the J-V curve, with extraction of charges at the electrodes in
competition with recombination. Recombination to triplets can proceed
faster than extraction, and we observe triplet formation in
working devices even under short-circuit conditions (Supplementary
Information). As the voltage increases from short-circuit conditions
towards Voc, charge densities and extraction times increase, leading to
higher bimolecular recombination losses. The film measurements here
represent the case of Voc, in which there is no extraction and all
charges recombine. We also note that any bimolecular recombination
process that is non-radiative must reduce efficiency below the
Shockley-Queisser limit, so avoidance of triplet formation is always
desirable.

We note finally that the recombination current in OPVs is analog- 
ous to the injection current in organic light-emitting diodes, where 
electrons and holes with uncorrelated spins are injected from the elec- 
trode and recombine within the active layer. Recent efforts to minimize 
losses due to the formation of non-radiative triplets have focused on 
manipulating energetic levels such that triplet states are higher in 
energy than CTSs, or on finding systems with very low exchange
energies. However, these approaches can impose restrictive design
criteria on materials and rely on inherently slow intersystem crossing 
from triplet to singlet. What we show here is that the introduction of 
weakly bound CTSs makes it possible to shut off recombination to 
non-radiative triplets, even when they are the lowest-energy excited 
state, and to achieve efficient recombination through the singlet chan- 
nel. This insight opens a new route to high-efficiency fluorescent 
organic light-emitting diodes. 


METHODS SUMMARY 


PCPDTBT, ICMA and ICBA were obtained from 1-material, and PC60BM and
PC70BM from Nano-C. PIDT-PhanQ was synthesized as described previously.

For the PIDT-PhanQ:fullerene thin-film samples, 1:3 polymer:fullerene blends
(20 mg ml⁻¹ in dichlorobenzene) were spun on fused-silica substrates. The
PIDT-PhanQ:PC60BM thin film discussed in Fig. 3d was spun from a 1:1 blend
(20 mg ml⁻¹ in chloroform). For the PCPDTBT:fullerene thin-film samples, 1:2
polymer:fullerene blends (30 mg ml⁻¹ in chlorobenzene) were spun on
fused-silica substrates.

For transient absorption measurements, 90-fs pulses generated in a Ti:sapphire 
amplifier system (Spectra-Physics Solstice) operating at 1 kHz were used. The 
broadband probe beam was generated in a home-built non-collinear optical para- 
metric amplifier. Pump pulses were generated using a frequency-doubled, 
q-switched Nd:YVO₄ laser (532 nm). Delay times from 1 ns to 100 μs were
achieved by synchronizing the pump laser with the probe pulse using an electronic 
delay generator. Samples were measured in a dynamic vacuum (<1 X 10° mbar). 


Full Methods and any associated references are available in the online version of 
the paper.

METHODS


Transient absorption spectroscopy. In this technique, a pump pulse generates 
photoexcitations within the film, which are then studied at some later time using a 
broadband probe pulse. Although transient absorption has been widely used to 
study the photophysics of OPV blends, previous measurements have been
severely limited by three factors: first, insufficient temporal range, typically a
maximum of 2-ns delay between pump and probe; second, limited spectral range and a
lack of broadband probes, which hinders the observation of dynamic interactions 
between excitations; and, last, insufficient sensitivity, which mandates the use of 
high excitation densities to create large signals. Here we overcome these problems 
by using broad temporal (up to 1 ms) and spectral windows (up to 1,500 nm) and 
high sensitivity (better than 5 X 10°). Although broad temporal and spectral
windows have previously been achieved, all three requirements have not been met 
simultaneously before. 

The temporal window is created by the use of an electrically delayed pump pulse 
and allows for the study of long-lived charges and triplet excitons. This was achieved
by synchronizing the pump laser (a frequency-doubled, q-switched Nd:YVO₄ laser
(532 nm) with 800-ps pulse width; AOT-YVO-25QSPX, Advanced Optical
Technologies) with the probe pulse using an electronic delay generator (SRS 
DG535, Stanford Research Systems). 

In conjugated polymers, local geometrical relaxation around charges (polaron 
formation) causes rearrangement of energy levels, bringing states into the semi- 
conductor gap and giving rise to strong optical transitions in the range 700- 
1,500 nm (ref. 2). The absorption bands of singlet and triplet excitons are also found 
to lie in the near infrared, making a broadband spectral window necessary to track the 
evolution of the excited-state species. To generate these probe pulses, a portion of the 
output of a Ti:sapphire amplifier system (Spectra-Physics Solstice) operating at 1 kHz 
was used to pump a home-built non-collinear optical parametric amplifier modelled 
after ref. 29. The probe beam was split and a portion passed through a region of the 
sample not affected by the pump, so that laser fluctuations could be normalized. The 
probe and reference signals were dispersed in a spectrometer (Andor, Shamrock SR- 
303i) and detected using a pair of 16-bit, 512-pixel linear image sensors (Hamamatsu). 

For short-time measurements (pump-probe delay, <2 ns), the excitation pulses
were generated by a TOPAS optical parametric amplifier (Light Conversion;
300-fs pulse width) seeded with a portion of the amplifier output, and the pump was
delayed using a mechanical delay stage (Newport). Every second pump pulse was
omitted electronically when using the q-switched source or, for short-time
measurements, using a mechanical chopper. Data acquisition at 1 kHz was enabled by
a custom-built board from Entwicklungsbüro Stresing. The differential
transmission (ΔT/T) was calculated after accumulating and averaging 1,000 pump-on and
pump-off shots for each data point.
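
A minimal sketch of the ΔT/T calculation for a single detector pixel is given below. It is an illustration under stated assumptions rather than the acquisition code used here: each probe reading is assumed to be divided by its simultaneous reference reading before averaging, and the function and argument names are hypothetical.

```python
from statistics import fmean


def differential_transmission(probe_on, ref_on, probe_off, ref_off):
    """Return dT/T for one pixel from pump-on and pump-off shots.

    Each probe reading is first normalised by the reference reading of the
    same shot (assumed scheme) so that laser fluctuations cancel.
    """
    t_on = fmean(p / r for p, r in zip(probe_on, ref_on))
    t_off = fmean(p / r for p, r in zip(probe_off, ref_off))
    return (t_on - t_off) / t_off
```

In a full measurement the same calculation would be repeated for every pixel of the sensor and every pump-probe delay.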

High stability of the probe beam, the use of a reference pulse to correct for
shot-to-shot variation and the ability to read out every shot allow for a high
signal-to-noise ratio. The high sensitivity of the experiment is essential as it
allows us to probe the dynamics of systems when the excitation densities are
similar to solar illumination conditions (10¹⁵-10¹⁷ excitations cm⁻³; see
Supplementary Information for calculation of excitation densities). At higher
excitation densities, bimolecular exciton-exciton and exciton-charge
annihilation processes can dominate, creating artefacts and making such
measurements unreliable indicators of device operation under AM1.5G
illumination.

All measurements were carried out under a dynamic vacuum (<1 X 10 ° mbar).
The data obtained were then smoothed using a moving average filter in
MATLAB. The step size of the filter was small to avoid losing spectral and
temporal accuracy. For measurement of the triplet spectra of pristine PCPDTBT, the
film was doped with an iridium complex that enhances the intersystem crossing
rate, leading to a high triplet population. The measured triplet spectrum was found
to be in good agreement with a previous measurement.
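
For illustration, a centred moving-average filter of the kind mentioned above can be written in a few lines. The analysis here was carried out in MATLAB; the Python sketch below is only an equivalent illustration, and the function name, the restriction to odd window lengths and the choice to drop the edge points are assumptions.

```python
def moving_average(values, window):
    """Centred moving average; returns only points with a full window."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    if window > len(values):
        raise ValueError("window must not exceed the number of samples")
    half = window // 2
    return [
        sum(values[i - half:i + half + 1]) / window
        for i in range(half, len(values) - half)
    ]
```

A small window, such as three points, smooths shot noise while leaving features that are broad compared with the window essentially unchanged.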
Numerical methods. To deconvolve the overlapping signatures of individual 
excited states and obtain their kinetics, we use numerical methods based on a 
genetic algorithm. The full details of this approach can be found elsewhere. In
summary, the algorithm starts by generating a large population of random spectra,
which it then breeds into successive generations of offspring using a ‘survival of the
fittest’ approach. Once the fitness stops improving with newer generations, the
best spectra are returned as the optimized solution. For a given solution, the fitness
is calculated as the inverse of the sum of squared residuals, with a penalty added for
non-physical results. The parents are selected using a tournament method with 
adaptive crossover and the offspring generated using a Gaussian-function mask 
(of random parameters) as the relative weight of each parent spectrum at a given 
wavelength. 
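
The fitness, selection and crossover steps described above could be sketched as follows. This is an illustration only and not the implementation used for the analysis: the function names, the form of the penalty (total negative amplitude in a spectrum), the fixed tournament size and the range of mask widths are all assumptions, and the adaptive aspect of the crossover is omitted.

```python
import math
import random


def sum_squared_residual(data, spectra, kinetics):
    """SSR between data[t][w] and the model sum_k kinetics[k][t] * spectra[k][w]."""
    ssr = 0.0
    for t, row in enumerate(data):
        for w, measured in enumerate(row):
            model = sum(kin[t] * spec[w] for spec, kin in zip(spectra, kinetics))
            ssr += (measured - model) ** 2
    return ssr


def fitness(data, spectra, kinetics, penalty_weight=1.0):
    """Inverse of (SSR + penalty); the penalty form is a hypothetical choice."""
    penalty = penalty_weight * sum(-v for spec in spectra for v in spec if v < 0)
    cost = sum_squared_residual(data, spectra, kinetics) + penalty
    return math.inf if cost == 0 else 1.0 / cost


def tournament_select(population, fitnesses, size, rng):
    """Pick `size` random candidates and return the fittest of them."""
    contenders = rng.sample(range(len(population)), size)
    best = max(contenders, key=lambda i: fitnesses[i])
    return population[best]


def gaussian_mask_crossover(parent_a, parent_b, wavelengths, rng):
    """Blend two parent spectra using a Gaussian mask with random centre and width."""
    span = wavelengths[-1] - wavelengths[0]
    centre = rng.uniform(wavelengths[0], wavelengths[-1])
    width = rng.uniform(0.05, 0.5) * span  # assumed range of mask widths
    child = []
    for lam, a, b in zip(wavelengths, parent_a, parent_b):
        weight = math.exp(-((lam - centre) ** 2) / (2.0 * width ** 2))
        child.append(weight * a + (1.0 - weight) * b)
    return child
```

Because the mask weight lies between 0 and 1 at every wavelength, each point of the offspring spectrum lies between the corresponding points of its two parents.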
Onset of deglacial warming in West Antarctica
driven by local orbital forcing 


WAIS Divide Project Members* 


The cause of warming in the Southern Hemisphere during the most
recent deglaciation remains a matter of debate. Hypotheses for a
Northern Hemisphere trigger, through oceanic redistributions of
heat, are based in part on the abrupt onset of warming seen in East
Antarctic ice cores and dated to 18,000 years ago, which is several
thousand years after high-latitude Northern Hemisphere summer
insolation intensity began increasing from its minimum, approximately
24,000 years ago. An alternative explanation is that local
solar insolation changes cause the Southern Hemisphere to warm
independently. Here we present results from a new, annually
resolved ice-core record from West Antarctica that reconciles these
two views. The records show that 18,000 years ago snow accumulation
in West Antarctica began increasing, coincident with increasing
carbon dioxide concentrations, warming in East Antarctica and cooling
in the Northern Hemisphere associated with an abrupt decrease
in Atlantic meridional overturning circulation. However, significant
warming in West Antarctica began at least 2,000 years earlier.
Circum-Antarctic sea-ice decline, driven by increasing local insolation,
is the likely cause of this warming. The marine-influenced West
Antarctic records suggest a more active role for the Southern Ocean
in the onset of deglaciation than is inferred from ice cores in the East
Antarctic interior, which are largely isolated from sea-ice changes.
Exceptional records of Southern Hemisphere climate change come
from Antarctic ice cores. Most of these records are from high-altitude
sites on the East Antarctic plateau. Questions about the reliability
of the two previous deep West Antarctic ice-core records result in
those records often being excluded from reconstructions of Antarctic
climate. Because the climate of West Antarctica is distinct from that of
interior East Antarctica, the exclusion of West Antarctic records may
result in an incomplete picture of past Antarctic and Southern Ocean
climate change. Interior West Antarctica is lower in elevation and more
subject to the influence of marine air masses than interior East Antarctica,
which is surrounded by a steep topographic slope. Marine-influenced
locations are important because they more directly reflect atmospheric
conditions resulting from changes in ocean circulation and sea ice.
However, ice-core records from coastal sites are often difficult to
interpret because of complicated ice-flow and elevation histories. The West
Antarctic Ice Sheet (WAIS) Divide ice core (WDC), in central West
Antarctica, is unique in coming from a location that has experienced
minimal elevation change, is strongly influenced by marine conditions
and has a relatively high snow-accumulation rate, making it possible
to obtain an accurately dated record with high temporal resolution.
Drilling of WDC was completed in December 2011 to a depth of
3,405 m. Drilling was halted ~50 m above the bedrock to avoid
contaminating the basal water system. WDC is situated 24 km west of the
Ross-Amundsen ice-flow divide and 160 km east of the Byrd ice-core
site (Supplementary Fig. 1). The elevation is 1,766 m; the present-day
snow accumulation rate is 22 cm yr⁻¹ (ice equivalent) and the average
temperature is approximately −30 °C. The age of the oldest recovered
ice is ~68 kyr. The WDC06A-7 timescale is based on the identification
of annual layers to 29.6 kyr ago using primarily electrical measurements
(Methods). To validate WDC06A-7, we compare times of abrupt changes
in atmospheric methane concentration (Supplementary Information)
with the Greenland Ice Core Chronology 2005 (GICC05). We also
compare the methane variations in WDC with abrupt changes in a
speleothem δ¹⁸O record from Hulu Cave, China. The difference in age
between the ice and gas at a given depth is calculated using a
steady-state firn-densification model and is always less than 500 yr. The age
differences between WDC06A-7 and GICC05 and between WDC06A-7
and the Hulu Cave timescale are much less than the independent
timescale uncertainties (Supplementary Fig. 6).

We interpret δ¹⁸O of ice (Methods) as annual-mean surface air
temperature, as supported by independent estimates of temperature
from borehole thermometry. WDC has many similarities with other
records (Fig. 1) and resolves Antarctic Isotope Maximum (AIM)
events clearly. The late Holocene WDC record shows cooling, suggesting
that the increase in δ¹⁸O at Byrd over the past few thousand years
resulted from ice advection and thinning. The abrupt increase in δ¹⁸O
~22 kyr ago at Siple Dome is not observed at WDC. The AIM1 peak 
and the subsequent Antarctic Cold Reversal (ACR; 14.5-12.9 kyr ago) 
are more pronounced in WDC than at Byrd and Siple Dome, possibly 
owing to discontinuous sampling of the Byrd core and thinning of 
Siple Dome. 

The most rapid warming at WDC occurred after the ACR and 
culminated at AIM0. The timing of AIM0 is difficult to define because
it is composed of two peaks, one 11.95 kyr ago and the other 11.6 kyr 
ago. The ice accumulation rate at WDC increased abruptly by 37% in 
the 400 yr between 12.0 and 11.6 kyr ago (Supplementary Fig. 2). The 
increase in ice accumulation with little change in δ¹⁸O shows that the
accumulation rate is not controlled strictly by temperature. Abrupt 
changes in accumulation cannot be recognized in most other Antarctic 
ice cores because their timescales lack sufficient resolution; it is thus 
unknown whether this event is specific to WDC or whether accumula- 
tion increased abruptly over a larger portion of Antarctica. 

The coldest period at WDC was between 28 and 22 kyr ago and was 
interrupted by AIM2, a 1,000-yr warm period between 24 and 23 kyr ago. 
AIM2 is also prominent in the EPICA Dronning Maud Land (EDML) ice
core but is muted or nearly absent in other East Antarctic records
(Fig. 1). Other West Antarctic cores also record AIM2, although the
low resolution of the Byrd core and the abrupt δ¹⁸O increase 22 kyr
ago in the Siple Dome core have made this feature difficult to discern. 
AIM2 illustrates the spatial heterogeneity of Antarctic climate variability 
during the coldest part of the glacial period. 

To investigate deglacial warming across the Antarctic continent, we 
use a sliding Wilcoxon rank-sum test (Fig. 2) to identify times of 
significant change in the δ¹⁸O records of WDC, EDML and the
EPICA Dome C ice core (EDC); we convert the EDC δD record to
δ¹⁸O using δ¹⁸O = (δD − 10)/8. The WDC and EDC timescales can be
aligned at a ~150-yr-long acid deposition event, which eliminates
the relative age uncertainty at 18 kyr ago. The rank-sum test reveals
three important features: gradual deglacial warming at WDC was 
punctuated by periods of more rapid change; the most abrupt warming 


*Lists of participants and their affiliations appear at the end of the paper.

Figure 1 | Antarctic Isotope Records. Water isotope ratios from nine
Antarctic ice cores. Inset, outline of Antarctica with the ice-core locations: Law
Dome (light green), Siple Dome (red), Byrd (pink), Talos Dome (khaki),
WDC (purple, WDC06A-7 timescale), EDML (blue), EDC (orange), Dome
Fuji (dark green), Vostok (black). Taylor Dome is not plotted because of
timescale uncertainties. All records are at original resolution. Thick lines for
WDC and EDML are 50-yr averages. EDML, EDC and Vostok use the
Lemieux-Dudon timescale. Numbers above the WDC curve indicate
AIM events. (δ¹⁸O = (¹⁸O/¹⁶O)sample/(¹⁸O/¹⁶O)VSMOW − 1 and
δD = (²H/¹H)sample/(²H/¹H)VSMOW − 1.)


began at the 18-kyr-ago acid deposition event; and significant warming 
at WDC began by 20 kyr ago, at least 2,000 yr before significant warm- 
ing at EDML and EDC. 
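The δD-to-δ¹⁸O conversion used above for the EDC record, δ¹⁸O = (δD − 10)/8, can be sketched in a few lines. This is only an illustration: the function name and the δD values are hypothetical and are not taken from the EDC data.

```python
def delta_d_to_d18o(delta_d_permil: float) -> float:
    """Convert deltaD (per mil) to an equivalent delta18O (per mil).

    Uses the relation quoted in the text: d18O = (dD - 10) / 8.
    """
    return (delta_d_permil - 10.0) / 8.0


# Hypothetical deltaD values in per mil, for illustration only.
example_delta_d = [-440.0, -430.0, -400.0]
example_d18o = [delta_d_to_d18o(v) for v in example_delta_d]
```

Putting all three records on a common δ¹⁸O scale in this way lets the same rank-sum test be applied to WDC, EDML and EDC.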

Further insight into deglacial warming at WDC is gained by investi- 
gating the sea-salt sodium (ssNa) record (Methods). Debate remains 
about whether ssNa on millennial timescales reflects primarily sea-ice 
production or the strength of atmospheric circulation. In the Amundsen
and Ross seas, changes in sea ice and atmospheric circulation are
coupled because atmospheric forcing is the dominant control on sea-ice
concentration. We interpret ssNa as a proxy for sea-ice extent and
a marker of marine changes (Supplementary Information). The rank-sum
test reveals that each rapid increase in δ¹⁸O, indicating warming,
was accompanied by a decrease in ssNa, suggesting less sea ice (Fig. 2).

Figure 2 | Timing of rapid change in Antarctica. a, Water isotope ratios
(δ¹⁸O, purple) and ssNa concentrations (black) from WDC on WDC06A-7.
EDML (blue) and EDC (orange) δ¹⁸O use the Lemieux-Dudon timescale.
Constants have been subtracted from δ¹⁸O records for plotting. Magenta boxes
indicate a 150-yr acid deposition event; the black line between EDC and EDML
is a volcanic tie point (Methods). ssNa is plotted as 25-yr median values. b, Rate
of change for δ¹⁸O at WDC, EDML, and EDC and ssNa at WDC. A Wilcoxon
rank-sum test (Methods) is used to determine significance. Significant rates of 
change are coloured by test time interval; rates of change that are not significant 
are coloured grey. 


Consistent with this, the decrease in δ¹⁸O during the ACR was accompanied
by an increase in ssNa.

The accumulation rate at WDC was inferred without assuming a 
relationship with δ¹⁸O or temperature (Methods). Although uncertainty
in the annual-layer interpretation and ice-flow history used to
determine the accumulation rate precludes a statistical assessment
comparable to that used for the δ¹⁸O and ssNa records, results suggest
that an initial increase in accumulation occurred between 18.5 and
17 kyr ago (Fig. 3), consistent with the rapid warming 18 kyr ago. This
also coincides with evidence for changes in Southern Ocean upwelling,
atmospheric carbon dioxide concentration and Atlantic meridional
overturning circulation (AMOC). The accumulation increase probably
results from more frequent or stronger moisture-bearing storms
penetrating into West Antarctica. This supports a southward shift
or intensification of the mid-latitude westerly storm track, and is
consistent with the hypothesis of a decrease in AMOC leading to
Southern Hemisphere warming and Northern Hemisphere cooling,
the ‘bipolar seesaw’.

Both the WDC and the lower-resolution Byrd ice-core records show 
that warming in West Antarctica began before the decrease in AMOC 
that has been invoked to explain Southern Hemisphere warming.
The most significant early warming at WDC occurred between 20 and 
18.8 kyr ago, although a period of significant warming also occurred 
between 22 and 21.5 kyr ago. The magnitude of the warming at WDC 
before 18 kyr ago is much greater than at EDML or EDC; linear regression
of δ¹⁸O between 22 and 18 kyr ago shows that it increased by

Figure 3 | Global records of deglaciation. a, Integrated annual insolation at
latitude 65° S. b, 100-yr averages of δ¹⁸O at WDC on WDC06A-7. c, 100-yr
averages of δD at EDC on the Lemieux-Dudon timescale. d, Relative
accumulation rate (normalized to the mean value between 19.5 and 18.5 kyr ago)
at WDC. Yellow shading is the uncertainty in identifying annual layers
(Methods). e, Atmospheric CO₂ concentration from EDC on the Lemieux-Dudon
timescale. f, Opal flux, a proxy for upwelling, from ocean sediment core
TN057-13-4PC in the South Atlantic. g, Pa/Th, a proxy for North Atlantic Deep
Water (NADW) circulation, from sediment core GCC5. Blue shading indicates a
period with relatively abrupt changes in all palaeoclimate records (b–g).

2.2‰ at WDC, by 0.4‰ at EDML and by 0.1‰ at EDC (Fig. 2). It is
very unlikely that the 2.2‰ increase at WDC can be attributed to
elevation change; this magnitude of isotope change would require 
more than 200 m of ice-sheet thinning, twice the amount of thinning 
that occurred during the Holocene epoch when the grounding 
line retreated hundreds of kilometres (Supplementary Information). 
The subdued warming at EDML and the lack of warming at EDC are 
consistent with the lack of clear AIM2 signals in some East Antarctic 
cores, and suggest that cores from the East Antarctic plateau do not 
capture the full magnitude of Southern Hemisphere climate variability. 

There is evidence that warming at WDC before 18 kyr ago is due to 
decreasing sea ice. The ssNa at WDC began to decrease 20 kyr ago, 
probably as a result of both decreasing sea-ice extent and decreasing 
strength of transport from changes in atmospheric circulation. A 
marine record from the southwest Atlantic Ocean indicates that significant
summer and winter sea-ice retreat began before 22 kyr ago.
Furthermore, a reduction in sea-ice extent can explain the different 
magnitude of warming among ice-core sites before 18 kyr ago. The 
high East Antarctic plateau is largely isolated from coastal changes 
because the local marine air masses do not have the energy to rise 
above the steep coastal escarpment.

To illustrate the variable sensitivity of different areas in Antarctica to 
changes in sea-ice extent, we used an atmospheric general circulation 
model. Using Last Glacial Maximum (LGM) sea surface temperature 
and sea-ice boundary conditions from a fully coupled model run, we
performed a control run of the ECHAM4.6 atmospheric model with 
the LGM sea-ice extent and a comparison run with reduced sea-ice 
extent (Supplementary Information). Sea surface temperatures are 
prescribed; the atmospheric circulation therefore responds to the 
change in sea-ice extent but the sea-ice extent is not further affected 
by the changes in atmospheric circulation. The magnitude of sea-ice 
retreat is consistent with evidence for reduced sea ice in the southwest 
Atlantic between 22 and 18 kyr ago. In response to the sea-ice retreat,
all of West Antarctica and coastal East Antarctica is enriched in
precipitation-weighted δ¹⁸O, whereas interior East Antarctica is little
changed or is depleted (Fig. 4). The positive δ¹⁸O anomalies probably
extend unrealistically far into the East Antarctic interior because of
the low-resolution topography in the climate model. Although the
details of the spatial pattern of δ¹⁸O anomalies are dependent on
model resolution and on the specified boundary conditions, the greater

Figure 4 | Antarctic δ¹⁸O response to sea-ice decrease. Response of
precipitation-weighted δ¹⁸O to an approximately zonally symmetric
southward displacement of the sea-ice edge (Supplementary Fig. 9) in the
ECHAM4.6 climate model run with LGM boundary conditions.

sensitivity of the WAIS Divide region to sea-ice decline compared with
locations in interior East Antarctica is clear. 

Local orbital forcing is a likely cause of the inferred sea-ice change. 
Integrated annual insolation at latitude 65° S increased by 1% between 
22 and 18 kyr ago. The additional annual insolation is 60 MJ m⁻²,
which is enough to melt 5 cm m⁻² of sea ice assuming an albedo of
0.75. The increase in integrated summer insolation, where summer is
defined as days with insolation above a threshold of 275 W m⁻², is
greater than the total annual increase (Supplementary Fig. 10). Thus, 
the increase comes in summer, when it is most likely to be absorbed by 
low-albedo open water. The summer duration also begins increasing at 
23 kyr ago; longer summers and shorter winters may also contribute to 
the decrease in sea-ice extent. The effect of an increase in insolation
would be amplified by the sea-ice/albedo feedback. 
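The melt estimate above can be checked with simple energy-balance arithmetic. The sketch below takes the 60 MJ m⁻² of extra annual insolation and the albedo of 0.75 from the text; the latent heat of fusion and the ice density are standard textbook values assumed here, not values stated in the paper.

```python
# Assumed physical constants (not given in the text).
LATENT_HEAT_FUSION_J_PER_KG = 3.34e5  # latent heat of fusion of ice
ICE_DENSITY_KG_PER_M3 = 917.0         # density of pure ice


def melt_thickness_m(extra_insolation_j_per_m2: float, albedo: float) -> float:
    """Thickness of ice (m) melted by the absorbed share of the extra insolation."""
    absorbed = (1.0 - albedo) * extra_insolation_j_per_m2
    return absorbed / (LATENT_HEAT_FUSION_J_PER_KG * ICE_DENSITY_KG_PER_M3)


# Values quoted in the text: 60 MJ per square metre, albedo 0.75.
thickness_m = melt_thickness_m(60e6, 0.75)
print(f"{thickness_m * 100:.1f} cm of ice")
```

With these assumed constants the result is close to the 5 cm quoted in the text. A lower albedo, as over open water, would increase the absorbed energy and hence the melt.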

The abrupt onset of East Antarctic warming, increasing CO₂ (ref. 20)
and decreasing AMOC 18 kyr ago has supported the view that deglaciation
in the Southern Hemisphere is primarily a response to changes in the
Northern Hemisphere. Yet the evidence of warming in West Antarctica
and corresponding evidence for sea-ice decline in the southeast Atlantic
show that climate changes were ongoing in the Southern Ocean before
18 kyr ago, supporting an important role for local orbital forcing. Warming
in the high latitudes of both hemispheres before 18 kyr ago implies
little change in the interhemispheric temperature gradient that largely
determines the position of the intertropical convergence zone and the
position and intensity of the mid-latitude westerlies. We propose that
when Northern Hemisphere cooling occurred ~18 kyr ago, coupled with
an already-warming Southern Hemisphere, the intertropical convergence
zone and mid-latitude westerlies shifted southwards in response.
The increased wind stress in the Southern Ocean drove upwelling, venting
of CO₂ from the deep ocean and warming in both West Antarctica
and East Antarctica. The new WDC record thus reveals an active role for 
the Southern Hemisphere in initiating global deglaciation. 


METHODS SUMMARY 


The WDC06A-7 timescale is based on measurements of sulphur, sodium, black 
carbon and electrical conductivity above 577 m (to 2,358 yr before AD 1950), and
primarily on electrical measurements below 577 m. Using atmospheric methane as 
a stratigraphic marker, WDC06A-7 and GICC05 agree to within 100 ± 200 yr at
the three abrupt changes between 14.7 and 11.7 kyr ago; WDC06A-7 is older by
500 ± 600 yr at 24 kyr ago, by 250 ± 300 yr at 28 kyr ago and by 350 ± 250 yr at
29 kyr ago (Supplementary Fig. 6). WDC06A-7 agrees within the uncertainties
with the Hulu Cave timescale and is older by 50 ± 300 yr at 28 kyr ago and by
100 ± 300 yr at 29 kyr ago.

We measured δ¹⁸O at a resolution of 0.5 m using laser spectroscopy with calibration
to Vienna Standard Mean Ocean Water (VSMOW). We report ssNa
concentration rather than flux because wet deposition dominates at higher accu- 
mulation rates. The accumulation-rate record was derived independently from the 
stable-isotope record using a one-dimensional ice-flow model to calculate the 
thinning function. 

Periods of significant change in δ¹⁸O and ssNa are identified with a sliding, non-parametric
Wilcoxon rank-sum test. The data were averaged to 25-yr resolution
for WDC and EDML, and to 50-yr resolution for EDC. We tested pairs of adjacent 
blocks of data against the null hypothesis of equal medians, performing the test at 
all points along the record. We assessed change on multiple timescales using a 
range of block sizes corresponding to time intervals of 250-1,000 yr for WDC and 
EDML and 500-1,000 yr for EDC. We used an effective 95% a-posteriori confidence
requirement; the critical significance level (p) was determined as p = 1 − 0.95^(1/N),
where N is the number of test realizations.
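The sliding test described above can be sketched in plain Python. This is an illustrative sketch, not the authors' code. It uses the normal approximation to the rank-sum statistic without a tie correction, takes N to be the number of tests run for one block size, and applies the critical level 1 − 0.95^(1/N). All function names and the synthetic step series are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_p_value(x, y):
    """Two-sided rank-sum p-value from the normal approximation (no tie correction)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])  # rank sum of the first block
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean_w) / sd_w
    return math.erfc(abs(z) / math.sqrt(2.0))


def critical_p(n_tests, confidence=0.95):
    """Per-test significance level giving an overall `confidence` across n_tests."""
    return 1.0 - confidence ** (1.0 / n_tests)


def sliding_rank_sum(series, block):
    """Compare the `block` samples before and after every interior point."""
    centres = list(range(block, len(series) - block + 1))
    p_values = [rank_sum_p_value(series[c - block:c], series[c:c + block])
                for c in centres]
    p_crit = critical_p(len(p_values))
    return [(c, p, p < p_crit) for c, p in zip(centres, p_values)]


# Synthetic record with a step at index 40 (hypothetical data).
series = ([0.01 * (i % 5) for i in range(40)]
          + [1.0 + 0.01 * (i % 5) for i in range(40)])
results = sliding_rank_sum(series, block=10)
significant = [c for c, p, flag in results if flag]
```

At the 25-yr resolution used for WDC and EDML, a block of 10 samples corresponds to the shortest 250-yr test interval. Repeating the call with larger blocks would cover the longer intervals.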

We used the ECHAM4.6 atmospheric general circulation model at T42 reso- 
lution (2.8° by 2.8°) with 19 vertical levels and glacial sea surface temperature 
boundary conditions. 


Full Methods and any associated references are available in the online version of
the paper.

METHODS


Stable-isotope measurements of ice. Water isotope analyses were performed by laser
spectroscopy at the University of Washington. Values of δ¹⁸O represent the
deviation from Vienna Standard Mean Ocean Water (VSMOW) normalized
to the VSMOW-SLAP standards and reported in per mil (‰). The precision of
the measurements is better than 0.1‰. The data have not been corrected for
advection, elevation, or mean seawater δ¹⁸O.

Accumulation rates. The accumulation-rate record was derived independently 
from the stable-isotope record using an ice-flow model to calculate the thinning 
function. We use a transient one-dimensional ice-flow model to compute the 
vertical-velocity profile: 


$$w(z) = -\left(\dot{b} - \dot{m} - \dot{H}\right)\psi(z) - \dot{m} - \left(\frac{\rho_i}{\rho(z)} - 1\right)\dot{b} \qquad (1)$$

Here $z$ is the height above the bed, $\dot{b}$ is the accumulation rate, $\dot{m}$ is the melt rate, $\dot{H}$
is the rate of ice-thickness change, $\rho_i$ is the density of ice, $\rho(z)$ is the density profile
and $\psi(z)$ is the vertical velocity shape function, computed
following ref. 33. Here $h$ is the distance above bedrock of the Dansgaard-Johnsen
kink height, $f_B$ is the fraction of the horizontal surface velocity due to sliding over
the bed and H is the ice thickness. Firn compaction is incorporated through the 
rightmost term in equation (1) and assumes a density profile that does not vary 
with time. 
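Equation (1) can be evaluated numerically as sketched below. The shape function is an assumption: it is one common Dansgaard-Johnsen form with basal sliding, written out here because its definition is not preserved in this text, and it should be checked against ref. 33. The ice thickness and ice density are likewise illustrative values; the accumulation rate, melt rate, kink height and sliding fraction follow the Methods.

```python
def shape_function(z, ice_thickness, kink_height, sliding_fraction):
    """Assumed Dansgaard-Johnsen-type shape function with basal sliding.

    Rises from 0 at the bed (z = 0) to 1 at the surface (z = ice_thickness)
    and is continuous at the kink height.
    """
    denom = ice_thickness - 0.5 * kink_height * (1.0 - sliding_fraction)
    if z <= kink_height:
        return (sliding_fraction * z
                + 0.5 * (1.0 - sliding_fraction) * z * z / kink_height) / denom
    return (z - 0.5 * kink_height * (1.0 - sliding_fraction)) / denom


def vertical_velocity(accumulation, melt, thickness_rate, shape, rho_ice, rho_z):
    """Equation (1): w = -(b - m - Hdot) * psi - m - (rho_i / rho(z) - 1) * b."""
    return (-(accumulation - melt - thickness_rate) * shape
            - melt
            - (rho_ice / rho_z - 1.0) * accumulation)


# Illustrative inputs (ice thickness and ice density are assumptions).
H = 3455.0       # ice thickness in m, assumed
h = 0.2 * H      # kink height of 0.2H, as in the Methods
f_b = 0.5        # sliding fraction, as in the Methods
b_dot = 0.22     # accumulation rate in m/yr (ice equivalent), present-day value
m_dot = 0.01     # basal melt rate in m/yr (1 cm/yr)
rho_i = 917.0    # density of ice in kg/m^3, assumed

w_bed = vertical_velocity(b_dot, m_dot, 0.0,
                          shape_function(0.0, H, h, f_b), rho_i, rho_i)
w_top = vertical_velocity(b_dot, m_dot, 0.0,
                          shape_function(H, H, h, f_b), rho_i, rho_i)
```

With solid-ice density throughout, the velocity equals the negative melt rate at the bed and the negative accumulation rate at the surface. Passing a lower firn density for `rho_z` near the surface activates the rightmost term of equation (1).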

A constant ice thickness was specified because the thickness change near the
divide was probably small (~100 m) and the timing of thickening and thinning is
not well constrained; a 100-m thickness change would alter the inferred accumulation
rate by ~3%. A constant basal melt rate of 1 cm yr⁻¹ and non-divide flow
conditions, represented by a Dansgaard-Johnsen kink height of 0.2H, were 
assumed. We also prescribed a sliding fraction of 0.5 of the surface velocity, which 
approximates effects of both basal sliding and enhanced shear near the bed, neither 
of which is well constrained. To assess the possible range of inferred accumulation 
rates, we also used sliding fractions of 0.15 and 0.9 (Supplementary Fig. 2). The 
inferred accumulation rate was only slightly affected for the Holocene part of the 
record but differed by up to 16% for the oldest part of the record (29.6 kyr ago). 
Because the thinning function varies smoothly, the uncertainty in the timing of the 
changes in accumulation rate is only weakly affected by the uncertainty in the 
magnitude of the accumulation rate. The main uncertainty in identifying the 
timing of accumulation rate changes is the uncertainty in the timescale itself. 
During the deglacial transition, the uncertainty in the interpretation is estimated 
at 8%. The yellow shading in Fig. 3 shows this uncertainty. 
WDC06A-7 timescale. The WDC06A-7 timescale is based on high-resolution
(<1 cm) measurements of sulphur, sodium, black carbon and electrical conductivity
(ECM) above 577 m (2,358 yr before present (BP; AD 1950); ref. 35). Below
577 m, WDC06A-7 is based primarily on electrical measurements: di-electrical
profiling was used for the brittle ice from 577 to 1,300 m (to 6,063 yr BP).
Alternating-current ECM measurements were used from 1,300 to 1,955 m (to 
11,589 yr BP) and both alternating-current and direct-current ECM measurements 
were used below 1,955 m. The interpretation was stopped at 2,800 m because the 
expression of annual layers becomes less consistent, suggesting that all years may 
not be easily recognized. 

The upper 577 m of the timescale has been compared with volcanic horizons 
dated on multiple other timescales; the uncertainty at 2,358 yr BP is ±19 yr. For
the remainder of the timescale, we assigned an uncertainty based on a qualitative 
assessment of the clarity of the annual layers. For ice from 577 to 2,020 m (2-12 kyr 
ago), we estimated a 2% uncertainty based on comparisons between the ECM and 
chemical (Na, SO₄) interpretations between 577 and 1,300 m, which agreed to
within 1% (Supplementary Fig. 4). The estimated uncertainty increased during the 
deglacial transition owing to both thinner layers and a less pronounced seasonal 
cycle. We compared the annual-layer interpretation of the ECM records in an 
800-yr overlap section (1,940-2,020-m depth, corresponding to 11.4-12.2 kyr ago) 
with various high-resolution chemistry records (sodium and sulphur). We found 
overall good agreement (19 yr more in the ECM-only interpretation) but did 
observe a tendency for the ECM record to ‘split’ one annual peak into two small 
peaks. We used this knowledge in the annual-layer interpretation of the ECM 
record. We increased the uncertainty to 4% between 2,020 and 2,300 m (12.2–15.5 kyr
ago) and to 8% between 2,300 and 2,500 m (15.5–20 kyr ago). The glacial
period had a stronger annual-layer signal than the transition, and we estimate a 6%
uncertainty for the rest of the glacial. The 150-yr acid deposition event, first
identified in the Byrd ice core, was found in WDC at depths of 2,421.75 to
2,427.25 m. Because there is consistently high conductance without a clear annual 
signal, we used the average annual layer thickness of the 10 m above and below this 
section to determine the number of years within it. There are periods of detectable 
annual variations within this depth range, and they have approximately the same 
annual-layer thickness as the 10-m averages. A 10% uncertainty was assumed. 

We assess the accuracy of WDC06A-7 by comparing it with two high-precision
timescales: GICC05 and a new speleothem timescale from Hulu Cave. Because the
age of the gas at a given depth is less than that of the ice surrounding it, we first
need to calculate the age offset (Δage). We use the inferred accumulation rates
and surface temperatures estimated from the δ¹⁸O record constrained by the
borehole temperature profile (Supplementary Information) in a steady-state
firn-densification model. The model is well-suited to WDC because it was
developed using data from modern ice-core sites that span the full range of past
WDC temperatures and accumulation rates. We calculate Δage using 200-yr
smoothed histories of surface temperature and accumulation rate, a surface density
of 370 kg m⁻³ and a close-off density of 810 kg m⁻³ (Supplementary Fig. 5a).
The calculated present-day Δage is 210 yr, which is similar to the value, 205 yr,
measured for WDC. The steady-state model is acceptable for WDC because the
surface temperature and accumulation rate vary more slowly than in Greenland.
Because our primary purpose is to assess the accuracy of the WDC06A-7 timescale,
calculation of Δage to better than a few decades is not necessary. The Δage
uncertainty between 15 and 11 kyr ago is estimated to be 100 yr. The Δage uncertainty
is estimated to be 150 yr for times before 20 kyr ago because of the colder
temperatures and lower and less certain accumulation rates. 

Because methane is well mixed in the atmosphere and should have identical 
features in both hemispheres, we use atmospheric methane measurements from 
WDC and the Greenland composite methane record to compare WDC06A-7
and GICC05 at six times. The age differences are summarized in Supplementary
Fig. 6 and the correlation and Δage uncertainties are shown in Supplementary
Table 1. In Greenland, methane and δ¹⁸O changes are nearly synchronous and
we therefore assume no Δage uncertainty in the Greenland gas timescale at times
of abrupt change. An exception is at 24 kyr ago (Dansgaard–Oeschger event 2),
when methane and δ¹⁸O changes do not seem to be synchronous. We estimate the
correlation uncertainty from the agreement of the methane records in Supplemen- 
tary Fig. 5. 

Speleothems can be radiometrically dated with U/Th and have smaller absolute 
age uncertainties than do annually resolved timescales in the glacial period.
Records of speleothem δ¹⁸O show many abrupt changes that have been tied to
the Greenland climate record. However, the physical link between δ¹⁸O variations
in the caves and methane variations is not fully understood. Therefore, there
is an additional and unknown correlation uncertainty in these comparisons. We 
compare WDC06A-7 with the new record from Hulu Cave, China, which is the 
best-dated speleothem record during this time interval. Comparisons can be made 
at only three times; our best estimate of the age differences is 100 yr or less. 

The EDC timescale can be compared with the WDC06A-7 at a ~150-yr-long 
acid deposition event. The two timescales agree within 100 yr, and we therefore
do not adjust either timescale. The EDML timescale has been synchronized with 
the EDC timescale using sulphate matches. The sulphate match that occurs 
during the 150-yr acid deposition event is marked in Fig. 2. 

Sea-salt sodium measurements. Sea-salt sodium (ssNa) is the amount of Na 
that is of marine origin. The Na record was measured at the Trace Chemistry 
Laboratory at the Desert Research Institute. Na is one of many elements measured 
on the continuous-flow analysis system, which is coupled to two inductively coupled 
plasma mass spectrometers. The effective sampling resolution is ~1 cm. Details of 
the analytical set-up are described elsewhere. Sea-salt Na is calculated assuming
Na/Ca mass ratios of 26.3 for marine aerosols and 0.562 for average crust
composition. Sea-salt Na can be influenced by volcanic activity if the ratio of Na
to Ca is different from the sea water and crustal ratios; the spike 20 kyr ago is part of
an Na-rich but Ca-poor volcanic event. We present ssNa concentration in the main
text instead of ssNa flux because wet deposition dominates at higher accumulation
rates. For comparison, the ssNa flux is shown in Supplementary Fig. 7.
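One way to apply the two Na/Ca mass ratios is a two-component mixing calculation, sketched below. The text gives only the ratios, so the mixing model itself is an assumption here: measured Na and Ca are each treated as the sum of a marine and a crustal part. The function name and the sample concentrations are hypothetical.

```python
MARINE_NA_CA = 26.3   # Na/Ca mass ratio for marine aerosols (from the text)
CRUST_NA_CA = 0.562   # Na/Ca mass ratio for average crust (from the text)


def sea_salt_sodium(total_na, total_ca):
    """Marine share of Na under an assumed two-component mixing model.

    Solves Na = Rm * ssCa + Rc * nssCa and Ca = ssCa + nssCa for ssCa,
    then returns ssNa = Rm * ssCa. Inputs and output share the same units.
    """
    ss_ca = (total_na - CRUST_NA_CA * total_ca) / (MARINE_NA_CA - CRUST_NA_CA)
    return MARINE_NA_CA * ss_ca


# Hypothetical sample: 1 unit of marine Ca and 2 units of crustal Ca.
sample_na = MARINE_NA_CA * 1.0 + CRUST_NA_CA * 2.0
sample_ca = 1.0 + 2.0
sample_ss_na = sea_salt_sodium(sample_na, sample_ca)
```

A sample whose Na/Ca ratio lies outside both end-member ratios, such as the Na-rich, Ca-poor volcanic spike noted above, breaks the mixing assumption and so gives a misleading ssNa value.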

Methane measurements. The methane concentration was measured in discrete 
samples at Oregon State University (OSU) and Pennsylvania State University 
(PSU) using automated melt-refreeze extraction and gas chromatography, with 
final concentration values reported on the NOAA04 concentration scale. OSU
data are corrected for gravitational fractionation, solubility and blanks as described
in ref. 37. The gravitational fractionation correction assumes that δ¹⁵N of N₂ is
0.3‰, a value based on late-Holocene measurements.

PSU methods were modelled on the basis of the OSU melt-refreeze system. The 
major difference between the OSU and PSU methods is the extraction cylinders:
glass at OSU and stainless steel at PSU. Using stainless steel cylinders carries the
added problem of a blank associated with CH₄ outgassing, which we have estimated
to be 19 ± 8 p.p.b. We have used a calculation similar to that derived in
ref. 37 to estimate the amount of CH₄ left in the vessel after refreezing; we verified
this using artificially degassed ice samples over which standard air was introduced
and processed. These results indicate a 3.8% reduction in the measured headspace
CH₄ value relative to the original trapped air, owing to solubility effects. The
constant solubility and blank corrections were applied to all PSU data. In general, 
replicate samples from each depth were run on separate days to ensure that the 
final averaged data were not aliased by day-to-day instrument drifts. The average 
difference between replicate analyses of 1,316 individual depths run over 4 yr was
7 ± 8 p.p.b. (1σ). Finally, the PSU data were also corrected for gravitational fractionation
by assuming that δ¹⁵N of N₂ is 0.3‰ throughout.

To ensure that the PSU and OSU CH₄ data sets can be accurately merged into a
single record, we performed an inter-calibration exercise involving a 100-m section
of the WDC06A core (400-500 m) where both labs sampled for CH₄ every
2 m. By interpolating the OSU data to compare with the PSU data, we determined
the average difference between the two labs over this 100-m interval to be
0.2 ± 9.9 p.p.b. (1σ). This result implies that we can merge CH₄ data from the
two labs without correcting for inter-laboratory offsets. 

Wilcoxon rank-sum test. Initial inspection of the WDC isotope record showed 
that warming was pulsed. We applied a sliding Wilcoxon rank-sum statistical test
to identify periods of significant change. A figure of the P values, for each indi- 
vidual Wilcoxon rank-sum test, is shown in Supplementary Fig. 8. A dashed line 
indicates the effective critical P value. Insignificant P values are plotted in grey, and 
significant P values are plotted in colours that correspond to timespan (block size) 
as in Fig. 2. The Wilcoxon rank-sum test makes no assumption of normality within 
the data and has been shown to be robust when used in windowing algorithms for 
the identification of periods of significant change in climate data. Our windowing
algorithm can also be applied using the more common Student’s t-test. Though 
parametric, such an implementation has the benefit of a well-established method 
for correcting the degrees of freedom for autocorrelation within the data.
Applying either statistical test, we identify nearly identical periods of significant 
change in the data sets. 

Climate modelling. To assess the effects of changing sea-ice conditions on 
precipitation-weighted δ¹⁸O in Antarctica, we used the ECHAM4.6 climate
model, implemented with the water isotope module. Model simulations used
a horizontal resolution of T42 (2.8° latitude by 2.8° longitude) with 19 vertical 
levels. The ECHAM4.6 model has been shown to reproduce Antarctic conditions 
realistically in the modern climate. We used the sea surface temperatures from
the PMIP2 fully coupled model experiments for LGM conditions ~21 kyr ago.
Those sea surface temperatures are prescribed as a model boundary condition for 
the atmospheric model runs with ECHAM4.6. We used a modern Antarctic ice- 
sheet configuration because the LGM configuration remains poorly known. 

Model experiments were designed to test the sensitivity of δ¹⁸O to changes in
sea-ice extent. In the control experiment, sea ice forms at −1.7 °C and the model
grid cell is set to 100% concentration below this threshold. The latitude of sea-ice 
coverage is decreased by lowering the ocean surface temperature threshold at 
which sea ice forms in the model. For the run with decreased sea ice, the freezing 
point was lowered from −1.7 to −3.7 °C. The amount of sea-ice reduction is not
zonally uniform around Antarctica because of asymmetric gradients in the pre- 
scribed sea surface temperature. We note that model sea surface temperatures do 
not change whether model sea ice is present or not. Newly formed open water in 
the run with reduced sea ice is below the freezing point. 
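The threshold rule described above can be illustrated on a toy grid. This is a schematic sketch only and not ECHAM4.6 code. The function name and the sea surface temperature values are hypothetical; only the two thresholds (−1.7 and −3.7 °C) come from the text.

```python
def sea_ice_concentration(sst_celsius, freezing_point_celsius):
    """Set each grid cell to full ice cover (1.0) where SST is below the threshold."""
    return [[1.0 if sst < freezing_point_celsius else 0.0 for sst in row]
            for row in sst_celsius]


def count_ice_cells(grid):
    """Number of fully ice-covered cells in the grid."""
    return sum(int(cell) for row in grid for cell in row)


# Hypothetical prescribed SST field in deg C; rows run from the coast northwards.
sst = [
    [-4.5, -4.0, -3.9],
    [-3.5, -2.5, -2.0],
    [-1.5, -0.5, 1.0],
]

control = sea_ice_concentration(sst, -1.7)  # control threshold
reduced = sea_ice_concentration(sst, -3.7)  # lowered threshold
```

Because the prescribed temperatures do not change between runs, lowering the threshold only moves the ice edge towards colder water. The cells in the middle row become open water that is still colder than −1.7 °C, as the text notes.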


Integrated insolation. We calculate integrated annual insolation at latitude 65° S 
following the tables prepared in ref. 26. We also calculate integrated ‘summer’ and
‘winter’ insolation using a cut-off of 275 W m⁻² (ref. 26; Supplementary Fig. 10).

Digit loss in archosaur evolution and the interplay
between selection and constraints 


Merijn A. G. de Bakker, Donald A. Fowler, Kelly den Oude, Esther M. Dondorp, M. Carmen Garrido Navas,
Jaroslaw O. Horbanczuk, Jean-Yves Sire, Danuta Szczerbinska & Michael K. Richardson


Evolution involves interplay between natural selection and developmental
constraints. This is seen, for example, when digits are lost
from the limbs during evolution. Extant archosaurs (crocodiles
and birds) show several instances of digit loss under different
selective regimes, and show limbs with one, two, three, four or the
ancestral number of five digits. The ‘lost’ digits sometimes persist
for millions of years as developmental vestiges. Here we examine
digit loss in the Nile crocodile and five birds, using markers of three 
successive stages of digit development. In two independent lineages 
under different selection, wing digit I and all its markers disappear. 
In contrast, hindlimb digit V persists in all species sampled, both as 
cartilage, and as Sox9-expressing precartilage domains, 250 million
years after the adult digit disappeared. There is therefore a mismatch 
between evolution of the embryonic and adult phenotypes. All limbs,
regardless of digit number, showed similar expression of sonic hedge- 
hog (Shh). Even in the one-fingered emu wing, expression of posterior 
genes Hoxd11 and Hoxd12 was conserved, whereas expression of 
anterior genes Gli3 and Alx4 was not. We suggest that the persistence 
of digit V in the embryo may reflect constraints, particularly the 
conserved posterior gene networks associated with the zone of polarizing
activity (ZPA). The more rapid and complete disappearance
of digit I may reflect its ZPA-independent specification, and hence, 
weaker developmental constraints. Interacting with these constraints 
are selection pressures for limb functions such as flying and perch- 
ing. This model may help to explain the diverse patterns of digit loss 
in tetrapods. Our study may also help to understand how selection 
on adults leads to changes in development. 

Digit loss is defined as the complete loss of all phalanges and the 
metapodial bone; it should be distinguished from digit reduction, in 
which only phalanges are lost. Digit loss can be adaptive. It reduces the
mass of the distal limb, and therefore its moment of inertia; this conserves
energy during running and flying. As a result, there is directional
selection for digit loss in the wing of all birds studied here, and
in the flying ancestors of the ratites. Hindlimb digit loss is similarly an
adaptation for running, and is seen in ratites, ungulates and some 
dinosaurs. 

In the flightless ostrich, the wings are miniaturized, but still have 
three digits, presumably because they are used for sexual display, to 
help in turning (banking) while running, and to shade the eggs. The 
emu wing has none of these functions and has lost digits because of
relaxation of stabilizing selection through disuse, so that changes in the
wing are selectively neutral. Conversely, selection pressures may favour the
retention of digits. This may be the case with hindlimb digit I in many
Neoaves. This digit is partly reduced, having only a vestigial metatarsal
I (Figs 1 and 2); nevertheless, its distal part is developed fully, presumably
because it is important for perching.

Given these varying selection pressures, the patterns of digit reduction
in tetrapod evolution are complex, making it difficult to identify
consistent trends or rules (Supplementary Fig. 1). For example, in
placental mammals with digit loss, digit I is usually (but not always) 
the first to disappear (Supplementary Fig. 1). In pigs, digits III and IV 
are the main functional elements; digits II and V are smaller and digit I 
is vestigial. Lizards frequently lose digit I in evolution, but not digit V
(although the latter frequently loses phalanges). Turtles and tortoises
may lose digit I from the hand, but not from the foot. In the hindlimb
of archosaurs, digit V is typically reduced or lost. 

Species differences in digit loss are also seen after chemical treat- 
ment of embryos. When such treatment entirely eliminates a digit from 
the foot, it is commonly digit I in anurans but digit V in urodeles. In

Figure 1 | Changes in adult digit number across archosaur phylogeny.
Phylogeny and divergence estimates based on refs 15, 21. Schematic drawings 
of the fore- and hindlimb of the studied species, made from original museum 
specimens (Supplementary Table 3). The number of digits in the adult skeleton 
varies from five (in the forelimb of the crocodile) to one (in the wing of the
emu). Yellow, digit I; orange, digit II; red, digit III; purple, digit IV; blue, digit V. 
Myr ago, million years ago. Asterisks indicate digit III. 

Figure 2 | Comparison of developmental and adult limb phenotypes in
archosaurs. Shh expression at early digital plate stage, Sox9 expression of the 
early prechondrocyte marker Sox9 (refs 9, 20) at late digital plate (paddle) stage; 
cartilage, Alcian blue staining for cartilage at the digital ray stage; and bone, 
digital radiographs of adult skeletons from recently prepared specimens in 
which the soft tissue has not been removed. In all pictures, anterior is to the top 
and distal to the right. Asterisks indicate digit III. 


both cases, the digit most sensitive to chemical treatment tends to be 
the digit developing last in the embryo, and the digit that disappears 
first in evolution. These findings led to the idea that digit loss, as with
many other morphological patterns, may be determined by a combination
of developmental constraints and natural selection.

The archosaur species studied here provide examples of limbs pos- 
sessing one, two, three, four and five digits (Figs 1 and 2). The study 
species are the Nile crocodile (Crocodylus niloticus), emu (Dromaius 
novaehollandiae), ostrich (Struthio camelus), chicken (Gallus gallus), 
Barbary dove (Streptopelia risoria) and zebra finch (Taeniopygia gut- 
tata). The crocodilian forelimb is the only limb in our sample that has 
all five adult digits (Figs 1 and 2). It also has five cartilaginous digits in
the embryo, each preceded by an expression domain of Sox9 (Fig. 2),
an early marker of prechondrocyte condensations.

Digit V has been lost from the adult skeleton in all other limbs 
sampled. Despite this, all limbs develop a transient Sox9 primordium 
in the digit V position (Fig. 2 and Supplementary Fig. 2) and all limb digit 
V domains progress to the cartilage stage (Fig. 2 and Supplementary 
Fig. 3). Therefore, considering the evolutionary timescale (Fig. 1; refs 15, 21),
the cartilage-forming pathways for digit V have persisted at least 
250 million years after digit V disappeared from the archosaurs’ adult 
hindlimb (Fig. 3). 

Digit I is lost in the forelimbs of all adult birds studied, and in the 
hindlimbs of the emu and ostrich. It is represented only by weak 
cartilage staining in the ostrich wing, and only by Sox9 and peanut
agglutinin (PNA) expression in the chicken wing; no Sox9 or cartilage
develops in the digit I position in the wings of the emu or Neoaves
(Barbary dove and zebra finch; Fig. 2 and Supplementary Figs 2 and 3). 

Figure 3 | Developmental sequences of digit loss mapped onto phylogeny,
with inferred ancestral conditions. Right, schematic diagrams of forelimb and 
hindlimb pairs, showing the stage to which each digit was found to develop in 
this study. The colours indicate the final stage (Sox9 domain, cartilage or bone) 
to which that digit developed. When none of these phenotypes was detected, the 
digit is indicated by X. Left, the inferred ancestral states at all nodes in the 
cladogram (based on refs 15, 21). 


Therefore, on the basis of the markers used here, and considering the 
estimated age of the clade Aves (refs 15, 21), digit I and its underlying pathways
are no longer detectable after 120 million years (Fig. 3). 

Digit loss is carried furthest in the ostrich and emu. The ostrich 
hindlimb has only two digits in the adult skeleton, but retains a full five 
Sox9 expression domains in the embryo, followed by five chondrified 
condensations (Fig. 2). The wing in the adult emu has only a single
digit, but in development shows distinct Sox9 domains for digits II, III
and IV and a weak digit V domain; this four-digit pattern is seen also in
the cartilaginous skeleton (Fig. 2).

These data show three things: first, expression of the early cartilage 
marker Sox9 can persist even when multiple digits have been lost; 
second, there are two independent lineages in which the Sox9 domain 
for digit I has disappeared, whereas that for digit V has persisted; and 
third, there is a mismatch between the adult and embryonic pheno- 
types such that the pattern of digits in the adult skeleton is not con- 
gruent with the pattern of digit primordia in the embryo. 
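
The mismatch between adult and embryonic phenotypes can be made concrete by tabulating the digit counts stated in the text. The sketch below uses only the three limbs for which the text gives explicit numbers; the dictionary layout and function name are illustrative assumptions.

```python
# Digit counts as reported in the text: adult skeleton, embryonic Sox9
# domains, and embryonic cartilage condensations.
LIMBS = {
    "crocodile forelimb": {"adult": 5, "sox9": 5, "cartilage": 5},
    "ostrich hindlimb": {"adult": 2, "sox9": 5, "cartilage": 5},
    "emu wing": {"adult": 1, "sox9": 4, "cartilage": 4},
}


def vestigial_primordia(limb):
    """Number of embryonic Sox9 domains with no counterpart in the adult skeleton."""
    return limb["sox9"] - limb["adult"]


mismatch = {name: vestigial_primordia(counts) for name, counts in LIMBS.items()}
```

A value of zero means the embryonic and adult patterns are congruent; positive values count primordia that form in the embryo but are absent from the adult.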

One proposed mechanism for controlling digit number is based on
changes in the timing of SHH expression at the posterior margin of the
limb bud, in the zone of polarizing activity (ZPA). The ZPA-SHH
pathway is deeply conserved in evolution and modulates the expression
of posterior Hox genes, as well as anterior patterning genes such as
Gli3 and Alx4. These in turn help to establish the digit pattern (possibly
through a Turing-type pre-pattern mechanism). Loss of SHH or ZPA
function in the mouse or chicken leads to a malformed limb with a
single, reduced digit. Furthermore, experimental inhibition of progressively
earlier phases of SHH signalling causes correspondingly fewer
digits to develop. Finally, in the lizard genus Hemiergis, species with
fewer digits seem to show earlier termination of Shh expression during
limb development.

Taken together, these recent studies may lead us to expect signifi- 
cant differences in the timing of Shh expression among our sample of 
archosaurs. However, we find that Shh remains expressed at least until 
the digits are already established as Sox9-expressing primordia, regard- 
less of the number of adult digits in the limb (Fig. 2 and Supplementary 
Figs 2 and 4). 

In addition to conserved patterns of posterior Shh expression, we 
also find conservation in the early posterior expression domains of
Hoxd11 and Hoxd12 (Fig. 4a). Even the emu wing, which is vestigial,
miniaturized and has only one digit, shows normal early expression 
domains of the posterior genes (Fig. 4a). By contrast, the early expres- 
sion patterns of the anterior genes Gli3 and Alx4 show more evolutio- 
nary variation (Fig. 4a). This suggests that the posterior pathways are 
more constrained than anterior ones in our sample. 

We suggest that the conserved patterns of gene expression at the 
posterior margin of the limb (Fig. 2 and Supplementary Figs 2 and 4) 
could explain the persistence of digit V primordia in embryos. The 
ZPA is pivotal for the development of the limb and influences the 
phenotype of all digits except digit I. We therefore suggest that
digit I is under weaker developmental constraints than digit V. This 
interpretation is strengthened by our observation (Fig. 3) of the parallel 
disappearance of digit I pathways in two species of Neoaves, and in the 
emu. When the conserved SHH signalling comes to an end (stages 26 
and 27 in the hindlimb and forelimb, respectively; Supplementary Fig. 
4b), we suggest that it becomes easier to suppress the growth and differentiation
of digit V. Interestingly, the general evolutionary trend we
find towards reduction in digit I markers (Fig. 3) is consistent with 

Figure 4 | Constraints versus selection pressures in the forelimb. a, The two
Hoxd genes show a similar posterior expression in all species. The anterior 
expressed Gli3 and Alx4 show modified expression in the emu and zebra finch. 
Thus, Gli3 is much more extensively expressed in the autopod, as is Alx4, and it 
shows an extra posterior domain in the emu wing. b, The balancing act between 
selection and developmental constraints; a model of our hypothesis to explain 
the different rates at which digit I and V pathways disappear in evolution. See 
Supplementary Fig. 5 for hindlimbs. 

previous studies showing a gradual reduction of wing digits in avian
evolution, as indicated by gradual loss of claws (reviewed in ref. 6). 

Our results are consistent with a model (Fig. 4b) in which an inter- 
play between natural selection and developmental constraints gives 
very different outcomes for the development of digit I and digit V in 
different lineages. A balancing act between selection and constraints 
may help to account for digit loss in other taxa. In land tortoises, for 
example, if a digit is lost entirely in one step, it is always digit I.

More generally, our model may help in understanding why the 
embryo digit phenotype is not always congruent with the adult digit 
pattern. For example, we show that the same selection pressure for 
digit loss has different degrees of developmental penetrance in wing 
digit I and V (Fig. 4b). The ‘developmental penetrance’ of a change in 
adult phenotype is the degree to which early developmental pathways 
are modified in order to produce that change. Loss of adult digit I in
the wing has a high developmental penetrance because constraints are 
weaker; it is therefore easier to modify developmental pathways 
(Fig. 4b) and lose digit I markers completely. The converse is true of 
digit V. In summary, the interplay between selection and constraints 
may be important for shaping not only adult phenotypes, but embryo 
phenotypes as well. 


METHODS SUMMARY 


The eggs of the Barbary dove (Streptopelia risoria) and the zebra finch (Taeniopygia 
guttata) were a gift of C. J. ten Cate and K. Riebel. The chicken (Gallus gallus), 
ostrich (Struthio camelus) and emu (Dromaius novaehollandiae) eggs were obtained 
from commercial breeders or from our own colony. The obtained eggs, embryos 
and radiographs of the Nile crocodile (Crocodylus niloticus) were a gift of S. Martin 
from ‘La Ferme aux Crocodiles’ in Pierrelatte (France). In total, 260 embryos were 
photographed and analysed (see Supplementary Table 1). The hybridization
protocols were as described previously.

The chicken Shh plasmid was a gift from C. J. Tabin, and the Alx4 from M. 
Schwarz, Salk Institute, San Diego. The probes for Sox9, Gli3, Hoxd11 and Hoxd12
were made in-house and their sequences deposited at the National Center for 
Biotechnology Information (NCBI) (Supplementary Table 2 and Supplementary 
Fig. 6). Embryos were grouped into stages according to the staging tables of 
Hamburger and Hamilton (ref. 30). When in doubt, the decision was made based on
the hindlimb characters as given in ref. 30. The radiographs were made of fresh 
specimens. The skeleton preparations used for the schematic drawings in Fig. 1 are 
of museum specimens or of our own collection (Supplementary Table 3). 


Full Methods and any associated references are available in the online version of 
the paper. 

METHODS


Incubation of eggs and collection of embryos. The chicken and ostrich eggs were 
from commercial breeders, as were some of the emu eggs. The Barbary dove and 
zebra finch eggs were a kind gift of C. J. ten Cate and K. Riebel. All chicken, dove 
and zebra finch eggs were incubated and collected in Leiden as were some of the 
ostrich, emu and crocodile eggs. The remaining emu and ostrich eggs were incu- 
bated and collected in Poland. Some of the crocodile eggs donated by ‘La Ferme 
aux Crocodiles’ were collected on site in Pierrelatte, France. 

For incubation temperatures and range of harvested stages see Supplementary 
Table 1. After candling the eggs, embryos were removed and transferred to ice-cold 
PBS in a Petri dish. The amnion was removed, and the embryo staged and fixed in
ice-cold 4% paraformaldehyde in PBS at 4°C overnight. Embryos were grouped
into stages according to the staging tables described previously. When in doubt,
the decision was made based on the hindlimb characters as given in ref. 30. The 
next day they were dehydrated in a graded methanol series and stored in 100% 
methanol at —20°C. 

Probe synthesis. For the probes synthesized in-house we isolated total RNA from 
one embryo with Trizol (Invitrogen) and reverse transcribed it with SuperScript III 
(Invitrogen) or RevertAid (Fermentas). On these templates we performed polymerase
chain reactions (PCRs) with specific primers and cloned the PCR products
in the TOPO TA pCRII vector (Invitrogen). The inserted PCR products were checked
by Sanger sequencing. The sequences were deposited at NCBI (Supplementary Table 2)
and compared with known sequences (Supplementary Fig. 6). From the sequence
data we could also determine the restriction enzyme needed to linearize the isolated
plasmid (Qiagen miniprep column) and the RNA polymerase needed to make the
anti-sense RNA probe labelled with digoxigenin. For this we used Sp6 or T7 RNA
polymerase (Roche or Fermentas) and the digoxigenin RNA labelling mix (Roche).
The probe was cleaned with RNeasy purification columns (Qiagen).

Alcian blue staining. Alcian blue stains acidic polysaccharides such as glycos- 
aminoglycans in cartilages. The fixed embryos were washed in 70% ethanol fol- 
lowed by acid alcohol (70% ethanol + 1% concentrated hydrochloric acid), stained 
in 0.03% (w/v) alcian blue in acid alcohol and washed in acid alcohol. 

Whole-mount in situ hybridization. The embryos were rehydrated through
graded methanols, digested lightly with proteinase K (20 to 40 µg ml⁻¹ PBS) for
20 min and re-fixed in paraformaldehyde in PBS after several washes in PBST (PBS
pH 7.2 with 0.1% Tween-20). This was followed by a pre-hybridization step at 60 °C
for at least 3 h or until the embryo had sunk. The hybridization mixture consisted of:
50% formamide, 2% Boehringer blocking powder, 5× SSC (standard sodium
citrate buffer, pH 7), 1 mg ml⁻¹ total RNA, 50 µg ml⁻¹ heparin, 0.1% Triton X-100,
0.1% CHAPS (3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate)
and 5 mM EDTA. After the pre-hybridization mix was removed, we added 400 ng
ml⁻¹ specific probe to fresh hybridization mixture preheated to 60 °C before
adding it to the embryo; it was incubated in this mix at 60 °C overnight with
shaking. The next day the specific probe mixture was removed, collected and
stored at −20 °C for reuse.

Several stringent washes were done at 60°C to remove non-specifically bound 
probe (2× SSC, 0.1% CHAPS, 50% formamide); (2× SSC, 0.1% CHAPS); (0.2×
SSC, 0.1% CHAPS). After washing several times at room temperature (20 °C) with
TBST (0.1 M tris buffered saline, pH 7.5, 0.1% Tween-20) the embryos were pre- 
incubated with 10% sheep serum in TBST for 90 min at room temperature fol- 
lowed by overnight incubation with sheep anti-digoxigenin conjugated to alkaline 
phosphatase (Roche; 1:5,000 dilution in 10% sheep serum in TBST at 4 °C over- 
night). The next day the non-specifically bound antibodies were washed away by 
several washes with TBST, including one overnight. The embryos were brought to 
a higher pH by washing them in NTT buffer (0.1 M sodium chloride, 0.1 M Tris-HCl,
0.1% Tween-20, pH 9.5). The enzyme reaction of alkaline phosphatase with
NBT/BCIP (nitro blue tetrazolium chloride/5-Bromo-4-chloro-3-indolyl phos- 
phate) or BM purple (both Roche) as substrate results in a blue precipitate. The 
development of the stain was checked regularly and stopped by washing several 
times in TBST, removing the substrate and chromogens, and lowering the pH.

Radiography. All radiographs were made using fresh specimens. The radiograph
of the zebra finch was made on film and later digitized at the Institute of Biology 
Leiden (IBL), the Netherlands. The chicken, emu and ostrich digital radiographs 
were made by D. van Marel of the Naturalis Biodiversity Center, the Netherlands. 
The crocodile was radiographed at ‘La Ferme aux Crocodiles’, France. 

Emergence of structural and dynamical properties of
ecological mutualistic networks 


Samir Suweis, Filippo Simini, Jayanth R. Banavar & Amos Maritan


Mutualistic networks are formed when the interactions between 
two classes of species are mutually beneficial. They are important 
examples of cooperation shaped by evolution. Mutualism between 
animals and plants has a key role in the organization of ecological 
communities. Such networks in ecology have generally evolved
a nested architecture independent of species composition and
latitude; specialist species, with only few mutualistic links, tend
to interact with a proper subset of the many mutualistic partners of
any of the generalist species. Despite sustained efforts to explain
observed network structure on the basis of community-level stability
or persistence, such correlative studies have reached minimal
consensus. Here we show that nested interaction networks could
emerge as a consequence of an optimization principle aimed at maximizing
the species abundance in mutualistic communities. Using
analytical and numerical approaches, we show that because of the 
mutualistic interactions, an increase in abundance of a given species 
results in a corresponding increase in the total number of individuals 
in the community, and also an increase in the nestedness of the 
interaction matrix. Indeed, the species abundances and the nested- 
ness of the interaction matrix are correlated by a factor that depends 
on the strength of the mutualistic interactions. Nestedness and the 
observed spontaneous emergence of generalist and specialist species 
occur for several dynamical implementations of the variational prin- 
ciple under stationary conditions. Optimized networks, although 
remaining stable, tend to be less resilient than their counterparts 
with randomly assigned interactions. In particular, we show analyti- 
cally that the abundance of the rarest species is linked directly to the 
resilience of the community. Our work provides a unifying frame- 
work for studying the emergent structural and dynamical properties 
of ecological mutualistic networks.

Statistical analyses of empirical mutualistic networks indicate that a 
hierarchical nested structure is prevalent and is characterized by nested- 
ness values that are consistently higher than those found in randomly 
assembled networks with the same number of species and interactions.
Nevertheless, the degree of nestedness varies among networks. Recently,
it has been argued that nestedness increases biodiversity and begets
stability, but these results are in conflict with robust theoretical evidence
showing that ecological communities with nested interactions
are inherently less stable than unstructured ones and that mutualism
could be detrimental to persistence. We aim to elucidate general
optimization mechanisms underlying network structure and its
influence on community dynamics and stability. 
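
For readers unfamiliar with how nestedness is quantified, the sketch below computes NODF (nestedness metric based on overlap and decreasing fill) for a binary interaction matrix. It follows the commonly used definition of that metric and is not taken from the authors' code; the function names are illustrative, and the matrix is assumed to have at least two rows or two columns.

```python
def _paired_score(vectors):
    """Sum of pairwise NODF scores (0 to 100 each) over all pairs of vectors."""
    total = 0.0
    n = len(vectors)
    for i in range(n):
        for j in range(i + 1, n):
            hi, lo = vectors[i], vectors[j]
            k_hi, k_lo = sum(hi), sum(lo)
            if k_hi < k_lo:
                hi, lo, k_hi, k_lo = lo, hi, k_lo, k_hi
            # Equal fills (or an empty vector) score zero: no decreasing fill.
            if k_hi == k_lo or k_lo == 0:
                continue
            shared = sum(1 for a, b in zip(hi, lo) if a and b)
            total += 100.0 * shared / k_lo
    return total


def nodf(matrix):
    """NODF of a binary matrix (rows: one guild, columns: the other)."""
    rows = [list(r) for r in matrix]
    cols = [list(c) for c in zip(*matrix)]
    n, m = len(rows), len(cols)
    pairs = n * (n - 1) / 2 + m * (m - 1) / 2
    return (_paired_score(rows) + _paired_score(cols)) / pairs


perfectly_nested = [[1, 1, 1], [1, 1, 0], [1, 0, 0]]
one_to_one = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

In the perfectly nested example each specialist interacts with a subset of the partners of every more generalist species, so the score is maximal; the one-to-one matrix has no such subset structure and scores zero.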

There is a venerable history of the use of variational principles for 
understanding nature, which has led to major advances in many sub- 
fields of physics, including classical mechanics, electromagnetism, 
relativity, and quantum mechanics. Our goal is to determine the appro- 
priate variational principle that characterizes a mutualistic community 
in the absence of detailed knowledge of the nature and strengths of the 
interactions between species and their environment. We begin by showing 
that increases in the abundances of the species lead to an increase in the
total number of individuals (henceforth referred to as the total population)
within the mutualistic community. We then show that, under
stationary conditions, the total population is directly correlated with 
nestedness and vice versa. Finally, we demonstrate that nested mutua- 
listic communities are less resilient than communities in which species 
interact randomly. These results suggest a simple and general optim- 
ization principle: key aspects of mutualistic network structure and its 
dynamical properties could emerge as a consequence of the maximiza- 
tion of the species abundance in the mutualistic community (see Fig. 1). 

We consider a community comprising a total of S interacting species 
(see Methods), in which population dynamics is driven by interspecific 
interactions. We model mutualistic and competitive species interactions 
using both the classical Holling type I and II functional responses
(Supplementary Information). We perform a controlled numerical experi- 
ment at the stable stationary state by holding fixed the number of spe- 
cies, the strengths of the interactions, and the connectance (the fraction 
of non-zero interactions), and seek to maximize individual species 
population abundances by varying the network architecture. The sim- 
plest approach consists of repeatedly rewiring the interactions of a 
randomly drawn species so as to increase its abundance, that is, each 
selected species attempts to change its mutualistic partners in order to 
enhance the benefit obtained from its interactions (see Methods and 
Supplementary Information). The optimization principle may then be 
interpreted within an adaptive evolutionary framework within which 
species maximize the efficiency of resource usage and minimize
their chances of becoming extinct owing to stochastic perturbations.
Interestingly, we find that enhancements in the abundance of any given 
species most often result in growth of the total population along with a
concomitant increase of the nestedness (see Fig. 1). These results dem- 
onstrate the existence of a correlation between nestedness and species 
abundance and highlight a non-trivial collective effect through which 
each successful switch affects the abundances of all species, leading to 
an inexorable increase, on average, of the total number of individuals in 
the community. 
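
The species-level rewiring procedure described above can be sketched in a few lines. This is a deliberately simplified illustration, not the authors' implementation: it assumes linear (Holling type I) mutualism of equal strength gamma, no competition within guilds, identical intrinsic growth alpha, and it rewires plants only. All names and parameter values are hypothetical.

```python
import random


def stationary_abundances(M, alpha=1.0, gamma=0.05, n_iter=200):
    """Fixed-point solution of p = alpha + gamma*M*a and a = alpha + gamma*M^T*p.

    Assumption: gamma is small enough for the iteration to converge.
    """
    n_p, n_a = len(M), len(M[0])
    p = [alpha] * n_p
    a = [alpha] * n_a
    for _ in range(n_iter):
        p = [alpha + gamma * sum(M[i][j] * a[j] for j in range(n_a)) for i in range(n_p)]
        a = [alpha + gamma * sum(M[i][j] * p[i] for i in range(n_p)) for j in range(n_a)]
    return p, a


def rewire_step(M, rng):
    """Move one link of a random plant; keep the move only if that plant gains."""
    i = rng.randrange(len(M))
    linked = [j for j, v in enumerate(M[i]) if v]
    free = [j for j, v in enumerate(M[i]) if not v]
    if not linked or not free:
        return False
    old, new = rng.choice(linked), rng.choice(free)
    before = stationary_abundances(M)[0][i]
    M[i][old], M[i][new] = 0, 1
    after = stationary_abundances(M)[0][i]
    if after > before:
        return True
    M[i][old], M[i][new] = 1, 0  # revert the unsuccessful swap
    return False


def optimise(M, steps=200, seed=0):
    """Return a rewired copy of M and the number of accepted swaps."""
    rng = random.Random(seed)
    M = [row[:] for row in M]
    accepted = sum(rewire_step(M, rng) for _ in range(steps))
    return M, accepted
```

Each swap preserves the number of links, so connectance stays fixed while the architecture changes, which is the constraint stated in the text.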

In order to make analytical progress and to better understand the 
correlation found between the optimization of individual species abundances,
the total number of individuals in the community and nestedness,
we turn to a mean field approximation in which the mutualistic
(and competitive) interactions are assumed to have the same magnitude. 
Within this approximation, we are able to prove that (see Supplemen- 
tary Information for mathematical details): (1) an increase in the abun- 
dance of any species more often than not leads to a net increase in the 
total population of the community; and (2) communities with larger 
total population have interaction matrices with higher nestedness and 
vice versa. The intraspecific (plant-plant and animal-animal) interac- 
tions have a key but secondary role compared to the mutualistic (plant- 
animal) interactions. The main effect of the intraspecific interactions is 
to break the degeneracy in the network overlap (Supplementary Infor- 
mation). Extensive numerical simulations in the more general, non-mean 
field case of heterogeneous interactions also confirm these findings. The 

Figure 1 | The optimization principle. a, The maximization of species
abundance leads to networks with a nested architecture. The optimal 
interaction matrix shown is the typical architecture resulting from averages 
performed over 100 optimal networks starting from random realizations. The 
blue scale is a measure of the average mutualistic strength normalized with 
respect to the maximum-strength interaction. b, Because of mutualism, the 


nestedness distributions of the optimized mutualistic networks shift 
markedly to higher values than their random network counterparts 
(see Figs 2a, b and 3). Monte Carlo simulations substantiate the strong 
correlation between the total population and the nestedness of the mutua- 
listic interaction network (Fig. 2c). Our analytic calculations show that, 
for identical increments in population abundances, a community charac- 
terized by weak mutualism has a larger increase in nestedness than one 
with strong mutualistic interactions (Supplementary Information). This 

Figure 2 | Relationship between nestedness and species abundance.


a, b, Histograms of the nestedness probability density for optimized mutualistic 
networks (shown in red) using the individual species optimization algorithm 
using Holling type I (HTI) (a) and Holling type II (HTII) (b) saturating 
functions. The histograms for the corresponding null model randomizations 
are also shown. In null model 0 (refs 6, 27), we preserve the dimensions and the 
connectivity of the optimized interaction network M with a random placement 
of the edges. In null model 1 (refs 6, 27), we also conserve the average number of 
connections for each plant and insect. The plots are obtained using 100 
realizations of the optimization algorithm presented in the main text. In each 
realization, a new initial interaction matrix, M, is extracted with the same 

optimization of the abundances of the individual species involved in the
interaction rewiring results in an overall increase of the total population of both
pollinators (in red) and plants (in blue). The curves represent the result of a
typical run (no average is involved). Simulations presented here are obtained
with Holling type II dynamics and parameters S = 50, C_Ω = C_Γ = 4/S^0.8 and
σ_Ω = σ_Γ < σ_c (see Methods Summary).


result suggests that, when the mutualistic interactions are strong, the 
network architecture may have a less crucial role than in the regime of 
weak mutualistic interactions, wherein optimal tuning of the architec- 
ture could lead to considerable beneficial effects for the community. 
Our results are very robust and do not depend on the details of the 
optimization algorithm, the initial condition or the transient dynamics. 
In addition to simulations of mutualistic communities starting with 
random interactions networks and then ‘reorganizing’ towards a more 

average μ = 0, variance σ_Ω = σ_Γ < σ_c, and connectance C_Ω = C_Γ (see
Methods). c, We consider interaction matrices (S = 50, C_Ω = C_Γ and
σ_Ω = σ_Γ < σ_c) with different values of nestedness and we calculate the
stationary population associated with each one of them: the nestedness and the
total abundance of individuals in the community are strongly correlated. Error
bars represent ±1 standard deviation over 100 realizations. The
connectance has been chosen to vary with the number of species as C_Γ = 4S^−0.8,
obtained as a best fit to empirical data (Supplementary Information). Similar 
results are obtained for different parameter values and implementations of the 
optimization algorithm. PDF, probability density function. NODF, nestedness 
metric based on overlap and decreasing fill. 
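
The caption's null model 0 and connectance scaling can be sketched as follows. This assumes the scaling is read as C = 4S^−0.8 and that null model 0 means shuffling all cells of the binary matrix so that its dimensions and number of links are preserved; the function names are illustrative.

```python
import random


def connectance(n_species):
    """Connectance scaling with community size, read here as C = 4 * S**-0.8."""
    return 4.0 * n_species ** -0.8


def null_model_0(M, rng):
    """Randomly place the links of M, keeping its dimensions and link count."""
    n_rows, n_cols = len(M), len(M[0])
    cells = [v for row in M for v in row]
    rng.shuffle(cells)
    return [cells[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


observed = [[1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
randomized = null_model_0(observed, random.Random(0))
```

Null model 1, which additionally conserves the average number of connections per species, would need a different randomization and is not covered by this sketch.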

Figure 3 | Box-whisker plots of the degrees of nestedness for optimized
networks. a, b, The ends of the whiskers represent the minimum and
maximum, whereas the ends of the box are the first and third quartiles and the
black bar denotes the median. The plots show the absolute nestedness (NODF)
(a) and relative nestedness (b) normalized to null model 1 (refs 6, 27) of 100


optimal state, we also implement a more realistic scenario in which 
mutualistic communities progressively assemble and are optimized 
over the course of evolutionary timescales. Indeed, we find that
the final result is the same, that is, the final optimized networks display 
a nested architecture (Supplementary Information). Notably, nested 
architectures in mutualistic communities could emerge from different 
initial conditions as a result of a rewiring of the interactions according 
to a variational principle aimed at maximizing either the fitness of the
individual insect/plant—whose surrogate is its species abundance— 
involved in the interaction swap (species-level optimization) or the 
fitness of the whole community, measured by the total population of 
all species (community-level optimization). The intriguing fact that 
these two optimization schemes lead to similar conclusions suggests 
that group selection mechanisms may have had an important role in
the evolution of cooperation among plants and pollinators.
Community persistence and stability are important dynamical pro- 
perties characterizing ecological networks, but the way in which the 
two are related in real systems is far from trivial. It has been suggested
that mutualistic network structures lead to high community
persistence. Persistence, however, is only defined for systems out
of their steady state, and is sensitive to initial conditions, transient
dynamics, and to the system’s distance from stationarity. Here we
bipartite networks resulting from: species-level optimization (1, 2), 
community-level optimization (3, 4) and random mutualistic networks (null 
model 0; refs 6, 27) (5). Parameters used here are $S = 50$, $C_\Omega = C_\Gamma = 4/S^{0.8}$ and 
$\sigma_\Omega = \sigma_\Gamma < \sigma_c$. 


focus on the study of community stability for the optimized stationary 
networks. Using perturbative expansion techniques, we analytically 
find that the abundance of the rarest species controls the stability of 
mutualistic communities (see Supplementary Information for mathe- 
matical details). Moreover, the optimized networks result in spontan- 
eous symmetry breaking with more-abundant generalist species and 
less-abundant specialist species (Fig. 4a). The relatively low abundances 
of the specialist species make them more vulnerable to extinction 
and result in correspondingly lower community resilience, as measured 
by the maximum real part of the eigenvalues of the community 
matrix (Fig. 4b). The advantage of having a high total population, 
leading to increased robustness against extinction due to demographic 
fluctuations, carries with it the cost associated with a lower 
resilience: the optimized network recovers from perturbations on a 
longer timescale than its random counterpart (Fig. 4c). 
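
The resilience measure described above (the maximum real part of the eigenvalues of the community matrix) can be illustrated with a minimal sketch. It assumes linear Lotka-Volterra dynamics, dx_i/dt = x_i (alpha_i + sum_j M_ij x_j), which this section does not state explicitly; the function names and the three-species toy matrix are hypothetical and chosen only for illustration.

```python
import numpy as np

def steady_state(M, alpha):
    """Fixed point of the assumed dynamics dx_i/dt = x_i (alpha_i + sum_j M_ij x_j)."""
    return np.linalg.solve(M, -alpha)

def resilience(M, x_star):
    """Largest real part of the eigenvalues of the Jacobian J = diag(x*) M."""
    J = x_star[:, None] * M  # row i of M scaled by the abundance of species i
    return np.max(np.linalg.eigvals(J).real)

# Hypothetical toy community: unit self-limitation, weak mutualism off the diagonal.
M = np.array([[-1.0, 0.2, 0.1],
              [0.2, -1.0, 0.0],
              [0.1, 0.0, -1.0]])
alpha = np.ones(3)

x_star = steady_state(M, alpha)
lam_max = resilience(M, x_star)
print("abundances:", x_star)
print("max Re(lambda):", lam_max)  # negative means the fixed point is locally stable
```

Under these assumptions, a value of max Re(lambda) closer to zero corresponds to slower recovery from small perturbations, which is the sense in which the text calls a community less resilient.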

Several ecological factors, as well as evolutionary history, contribute 
to shaping empirical networks. Here we have shown how binary nested 
network architecture could emerge as a consequence of an optimiza- 
tion process or variational principle. An interesting unexplored issue is 
an analysis of emergent quantitative nestedness, that is, the organization 
of the interaction intensities in the optimized networks along 
with a comparison to empirical data in mutualistic networks. The 
Figure 4 | Optimized networks are less resilient. a, Average species 
abundance $\langle x \rangle$ as a function of the number of mutualistic partners of a 
species. The error bars represent the ±1 standard deviation confidence interval. 
Generalist species with more connections are, on average, more abundant than 
specialist species with fewer connections. The red points in the inset depict 
$\langle x \rangle$ as a function of the mutualistic strength s. b, Relationship between the 
abundance of the rarest species and system resilience given by the largest 
among the real parts (closest to zero) of the eigenvalues of the linearized 
stability matrix. The grey line shows a linear fit ($R^2 = 0.999$). c, Probability 
density function of the largest among the real parts of the eigenvalues, 
$\max(\mathrm{Re}(\lambda))$, of the optimized community stability matrix (red curve) and of 
the corresponding initial random networks (grey curve). Optimized networks 
are less resilient than their random counterparts. The plots are obtained from 
100 realizations of the community-optimization algorithm performed with 
HTI saturating function, $S = 50$, $C = 4/S^{0.8}$ and $\sigma_\Omega = \sigma_\Gamma < \sigma_c$. 
framework proposed here ought to be applicable for investigating the 
possible driving forces sculpting mutualistic network architectures in a 
variety of systems ranging from social to economic and other 
biological (for example, protein interaction) networks. 


METHODS SUMMARY 

Interaction networks. The initial interaction matrix M is composed of four blocks: 
two diagonal blocks describing direct competition among plants ($\Omega_{PP}$) and 
pollinators ($\Omega_{AA}$) and two off-diagonal blocks characterizing the mutualistic 
interactions between $n_P$ plants and $n_A$ pollinators ($\Gamma_{PA}$), and vice versa ($\Gamma_{AP}$). The total 
possible number of mutualistic interactions in each of these two latter blocks is 
equal to $n_A \times n_P$. The connectance $C_\Gamma$ (or $C_\Omega$) represents the fraction of the 
mutualistic (competitive) interactions that are non-zero. Mutualistic interaction 
intensities $\gamma_{ij}^{AP}$ and $\gamma_{ij}^{PA}$ represent the increase of the growth rate of animal (or plant) 
species i per unit of plant (animal) biomass j and they are assigned randomly from 
the distribution $|N(0, \sigma_\Gamma)|$, whereas the competitive interactions are distributed as 
$-|N(0, \sigma_\Omega)|$. Here, $N(\mu, \sigma)$ is the normal distribution with mean $\mu$ and variance 
$\sigma$, chosen to have stability of the underlying population dynamics, that is, $\sigma < \sigma_c$, 
where $\sigma_c$ is the critical strength threshold above which, with high probability, no 
stable fixed point dynamics exist. 
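
The block structure described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `build_interaction_matrix` is hypothetical, the diagonal self-limitation of −1 is an assumption (the text does not specify it), both mutualistic blocks are assumed to share one plant-by-pollinator link pattern, and sigma is passed to NumPy as a standard deviation.

```python
import numpy as np

def build_interaction_matrix(n_p, n_a, c_gamma, c_omega, sigma_gamma, sigma_omega, rng):
    """Assemble M = [[Omega_PP, Gamma_PA], [Gamma_AP, Omega_AA]] (plants first)."""
    def half_normal(shape, sigma):
        # |N(0, sigma)|; NumPy interprets sigma as a standard deviation
        return np.abs(rng.normal(0.0, sigma, shape))

    # Competition blocks: -|N(0, sigma_omega)| on a random fraction c_omega of entries.
    omega_pp = -half_normal((n_p, n_p), sigma_omega) * (rng.random((n_p, n_p)) < c_omega)
    omega_aa = -half_normal((n_a, n_a), sigma_omega) * (rng.random((n_a, n_a)) < c_omega)
    # Assumption: unit self-limitation on the diagonal.
    np.fill_diagonal(omega_pp, -1.0)
    np.fill_diagonal(omega_aa, -1.0)

    # Mutualism blocks: one shared link pattern (assumption), independent strengths.
    links = rng.random((n_p, n_a)) < c_gamma
    gamma_pa = half_normal((n_p, n_a), sigma_gamma) * links
    gamma_ap = half_normal((n_a, n_p), sigma_gamma) * links.T
    return np.block([[omega_pp, gamma_pa], [gamma_ap, omega_aa]])

# Hypothetical small example: 6 plants, 4 pollinators.
rng = np.random.default_rng(0)
M = build_interaction_matrix(6, 4, c_gamma=0.5, c_omega=0.3,
                             sigma_gamma=0.05, sigma_omega=0.05, rng=rng)
print(M.shape)  # (10, 10)
```

Drawing the link pattern once and reusing its transpose keeps every mutualistic link reciprocal; independent patterns for the two blocks would be an equally simple variant if one-sided links are intended.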

Optimization algorithm. We start with an existing network; select a species, j, 
randomly and an existing link to one of its partner species k; we attempt a rewire 
between the j-k and the j-m links (where m is a potential alternative mutualistic 
partner), that is, $\gamma_{jk}$ is interchanged with $\gamma_{jm}$. If the j-m link already exists, that is, 
$\gamma_{jm}$ is different from zero, the switch leads to an interchange of interaction 
strengths; otherwise the swap corresponds to rewiring the j-k link to j-m. The 
switch is accepted if and only if it does not lead to a decrease of the population 
abundance of species j in the steady state of the new network configuration. See 
Supplementary Information for details. 
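
The rewiring rule just described can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes linear Lotka-Volterra dynamics with steady state x* = −M⁻¹α, the names `steady_state`, `rewiring_step` and `partners` are hypothetical, and the extra requirement that the new steady state stays positive is an added assumption.

```python
import numpy as np

def steady_state(M, alpha):
    """Fixed point of the assumed dynamics dx_i/dt = x_i (alpha_i + sum_j M_ij x_j)."""
    return np.linalg.solve(M, -alpha)

def rewiring_step(M, partners, alpha, rng):
    """Attempt one species-level swap; return the accepted or the unchanged matrix.

    partners[j] lists the column indices of the potential mutualistic partners of j.
    """
    j = int(rng.integers(M.shape[0]))
    cols = np.asarray(partners[j])
    linked = cols[M[j, cols] != 0]
    if linked.size == 0 or cols.size < 2:
        return M  # species j has no link to rewire
    k = int(rng.choice(linked))            # existing partner
    m = int(rng.choice(cols[cols != k]))   # alternative partner
    trial = M.copy()
    trial[j, k], trial[j, m] = M[j, m], M[j, k]  # interchange gamma_jk and gamma_jm
    x_old = steady_state(M, alpha)
    x_new = steady_state(trial, alpha)
    # Accept only if the abundance of j does not decrease (plus assumed feasibility check).
    if x_new[j] >= x_old[j] and np.all(x_new > 0):
        return trial
    return M

# Hypothetical toy network: 3 plants (rows 0-2) and 3 pollinators (rows 3-5).
rng = np.random.default_rng(1)
n = 3
M = -np.eye(2 * n)
M[:n, n:] = [[0.1, 0.0, 0.0], [0.05, 0.1, 0.0], [0.0, 0.0, 0.1]]
M[n:, :n] = M[:n, n:].T
partners = [list(range(n, 2 * n))] * n + [list(range(n))] * n
alpha = np.ones(2 * n)
row_values = np.sort(M, axis=1)
for _ in range(50):
    M = rewiring_step(M, partners, alpha, rng)
print(steady_state(M, alpha))
```

Because each move only exchanges two entries within row j, the set of interaction strengths of every species is conserved; only their assignment to partners changes.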
Loss of sexual reproduction is considered an evolutionary dead end 
for metazoans, but bdelloid rotifers challenge this view as they 
appear to have persisted asexually for millions of years (ref. 1). Neither 
male sex organs nor meiosis have ever been observed in these 
microscopic animals: oocytes are formed through mitotic divisions, 
with no reduction of chromosome number and no indication 
of chromosome pairing (ref. 2). However, current evidence does not 
exclude that they may engage in sex on rare, cryptic occasions. Here 
we report the genome of a bdelloid rotifer, Adineta vaga (Davis, 
1873), and show that its structure is incompatible with conven- 
tional meiosis. At gene scale, the genome of A. vaga is tetraploid 
and comprises both anciently duplicated segments and less diver- 
gent allelic regions. However, in contrast to sexual species, the 
allelic regions are rearranged and sometimes even found on the 
same chromosome. Such structure does not allow meiotic pairing; 
instead, we find abundant evidence of gene conversion, which may 
limit the accumulation of deleterious mutations in the absence of 
meiosis. Gene families involved in resistance to oxidation, car- 
bohydrate metabolism and defence against transposons are signifi- 
cantly expanded, which may explain why transposable elements cover 
only 3% of the assembled sequence. Furthermore, 8% of the genes are 
likely to be of non-metazoan origin and were probably acquired 
horizontally. This apparent convergence between bdelloids and pro- 
karyotes sheds new light on the evolutionary significance of sex. 
With more than 460 described species (ref. 4), bdelloid rotifers (Fig. 1) 
represent the highest metazoan taxonomic rank in which males, 
hermaphrodites and meiosis are unknown. Such persistence and 
diversification of an ameiotic clade of animals are in contradiction with the 
supposed long-term disadvantages of asexuality, making bdelloids an 
‘evolutionary scandal’ (ref. 5). Another unusual feature of bdelloid rotifers is 
their extreme resistance to desiccation at any stage of their life cycle (ref. 6), 
enabling these microscopic animals to dwell in ephemeral freshwater 
habitats such as mosses, lichens and forest litter; this ability is presumably 
the source of their extreme resistance to ionizing radiation (ref. 7). 


We assembled the genome of a clonal A. vaga lineage into separate 
haplotypes with an N50 of 260 kilobases (kb) (that is, half of the assembly 
was composed of fragments longer than 260 kb). Assembly size was 
218 megabases (Mb) but 26 Mb of the sequence had twice the average 
sequencing coverage, suggesting that some nearly identical regions 
were not resolved during assembly (Supplementary Fig. 3); hence, 
the total genome size is likely to be 244 Mb, which corresponds to 
the estimate obtained independently using fluorometry (Supplemen- 
tary Note C2). Annotation of the complete assembly (including all 
haplotypes) yielded 49,300 genes. Intragenomic sequence comparisons 
revealed numerous homologous blocks with conserved gene order 
(colinear regions). For each such block we computed the per-site syn- 
onymous divergence (Ks) and a colinearity metric defined as the frac- 
tion of colinear genes. Colinear blocks fell into two groups (Fig. 2a): a 
group characterized by high colinearity and low average synonymous 
divergence, and a group characterized by lower colinearity and higher 
synonymous divergence. The presence of two classes of colinear blocks 
is consistent with a tetraploid structure comprised of alleles (recent 
homologues) and ohnologues (ancient homologues formed by genome 
duplication). Allelic pairs of coding sequences are on average 96.2% 
Figure 1 | Position of bdelloid rotifers among metazoans. Bdelloid rotifers 
(‘leech-like wheel-bearers’) are a clade of microscopic animals (scale bar, 
100 µm) within the phylum Rotifera. Photographs of Hemichordata 
(Saccoglossus), Chordata (Homo) and Ecdysozoa (Drosophila) courtesy of 
David Remsen (MBL), John van Wyhe (http://darwin-online.org.uk) and 
André Karwath, respectively. 
Figure 2 | A locally tetraploid genome. a, Analysis of intragenomic synteny 
reveals two groups of colinear regions: alleles (in violet, regions characterized by 
a high fraction of colinear genes and low average Ks, that is, synonymous 
divergence) and ohnologues (in orange, with lower colinearity but higher Ks). 
b, Example of a genomic quartet of four scaffolds: allelic gene pairs are 
connected with violet curves and ohnologous gene pairs with orange curves. 


identical at the nucleotide level (median = 98.6%) versus 73.6% 
(median = 75.1%) for ohnologous pairs. Nearly 40% (84.5 Mb) of the 
assembled genome sequence is organized in quartets of four homologous 
regions $A_1$, $A_2$, $B_1$ and $B_2$, of which $A_1$–$A_2$ and $B_1$–$B_2$ are two pairs 
of alleles and As are ohnologous to Bs (ref. 8) (Fig. 2b). 
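
The N50 statistic and the genome-size estimate quoted for the assembly follow from simple arithmetic, sketched below. The function name `n50` and the toy fragment lengths are hypothetical; only the 218 Mb and 26 Mb figures come from the text.

```python
def n50(lengths):
    """Length L such that fragments of length >= L contain at least half of the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("empty assembly")

# Hypothetical fragment lengths (kb): total 1,500, so the 500 and 400 kb
# fragments together (900 kb) already exceed half of the assembly.
toy_n50 = n50([100, 200, 300, 400, 500])
print(toy_n50)  # 400

# Genome-size estimate from the text: the 26 Mb with doubled coverage is counted twice.
assembly_mb = 218
collapsed_mb = 26
estimated_genome_mb = assembly_mb + collapsed_mb
print(estimated_genome_mb)  # 244
```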

We found evidence of genomic palindromes up to 705 kb in length 
and involving up to 148 genes. The A. vaga genome contains at least 
17 such palindromic regions (Fig. 3a) reminiscent of those reported in 
the Y chromosomes of primates (ref. 9). In all 17 cases, the arms of the 
palindromes present the colinearity and divergence signatures of allelic 
regions and do not have other allelic duplicates in the assembly, sug- 
gesting that they arose by inter-allelic rearrangements rather than by 
local duplications. In addition to these 17 inverted repeats, we 
observed three direct repeats that present the signatures of allelic 
blocks and involve up to 50 genes (Fig. 3a). The cumulative length of 
the assembly fragments (scaffolds) bearing these 20 allelic rearrange- 
ments is 7.5 Mb or 3.5% of the genome sequence. Allelic regions that 
are found on the same chromosome clearly cannot segregate during 
meiosis. Moreover, we found hundreds of colinearity breakpoints 
between allelic regions, and the total length of the scaffolds that have 
no full-length homologue in the assembly due to these breakpoints 
exceeds 109 Mb or 51% of the genome assembly (including 91 of the 
100 largest scaffolds, Fig. 3b and Supplementary Fig. 10). As a result, it 
is impossible to split the assembled genome of A. vaga into haploid 
sets: the apparent ploidy level of A. vaga is scale-dependent, with a 
tetraploid structure at gene scale versus chromosome-scale haploidy. 
Such relaxation of constraints on genome structure is reminiscent of 
other mitotic lineages such as cancer cells (ref. 10) and somatic tissues (ref. 11). 

It has been proposed that, in the absence of meiosis, alleles accumulate 
mutations independently from one another, to the point that ancient 
asexuals may harbour genome-wide allele sequence divergence (ASD; ref. 12) 
larger than inter-individual differences (the so-called ‘Meselson effect’). 
However, the average inter-allelic divergence of A. vaga is only 4.4% 
at the nucleotide level (3% when looking at synonymous divergence), 
which falls in the upper range reported for sexually reproducing species (ref. 13). 
The absence of genome-wide ASD could be explained by low mutation 
rates and/or by frequent mitotic recombination (such as gene conversion 
resulting from DNA repair; ref. 12). Although there is no evidence of 
reduced mutation rates in bdelloid rotifers compared with their cyclically 
sexual sister clade the monogononts (ref. 14), we found strong signatures 


Figure 3 | A genome structure incompatible with conventional meiosis. 

a, In twenty cases, allelic regions are found to occur on the same chromosome. 
All curves shown connect allelic gene pairs. On three scaffolds both allelic 
regions have the same orientation (direct repeats, in pink), whereas on the 
seventeen other scaffolds they are inverted (palindromes, in red). b, Local 
colinearity between alleles does not extend to chromosome scale. Colours are 
arbitrary and only allelic gene pairs are represented. Asterisks highlight 
colinearity breakpoints between scaffold av1 and its allelic partners av44, av94, 
av122, av316 and av448. Further examples for other scaffolds are shown in 
Supplementary Fig. 10. 
of recent gene conversion events in the distribution of identity track 
lengths, that is, distances between consecutive mismatches (Fig. 4a and 
Supplementary Note E1). We calculated that the probability that a given 
base in the genome experiences gene conversion is at least one order of 
magnitude greater than its probability to mutate (Supplementary Note 
E1), suggesting that homologous regions in the genome of A. vaga 
undergo concerted evolution (ref. 15). Homogenization through gene 
conversion may either expose new mutations to selection by making them 
homozygous or remove them as they get overwritten with the other 
allelic version (Fig. 4b), thereby slowing Muller’s ratchet (that is, the 
irreversible accumulation of detrimental mutations in asexual popula- 
tions of finite sizes, Supplementary Note E2 and Supplementary Fig. 11). 
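
The identity-track test for gene conversion can be illustrated with a small simulation. This is a sketch under stated assumptions, not the analysis of Supplementary Note E1: mismatches are placed uniformly at random (the null case without gene conversion), the function names are hypothetical, and the comparison uses the exponential survival function exp(−x/M), where M is the mean distance between mismatches.

```python
import numpy as np

def identity_track_lengths(mismatch_positions):
    """Distances between consecutive mismatches in an alignment of two alleles."""
    return np.diff(np.sort(np.asarray(mismatch_positions)))

def expected_fraction_longer_than(x, mean_distance):
    """Under a Poisson mutation process, P(track length > x) = exp(-x / M)."""
    return np.exp(-x / mean_distance)

# Null simulation: 20,000 mismatches scattered uniformly over a 1 Mb alignment.
rng = np.random.default_rng(0)
positions = rng.choice(1_000_000, size=20_000, replace=False)
tracks = identity_track_lengths(positions)
M = tracks.mean()

observed = np.mean(tracks > 2 * M)
expected = expected_fraction_longer_than(2 * M, M)
print(f"mean track length: {M:.1f} bp")
print(f"fraction of tracks longer than 2M: observed {observed:.3f}, expected {expected:.3f}")
```

In this null case the observed and expected fractions agree closely. With real allelic alignments, an excess of long tracks and a deficit of short ones relative to the exponential expectation is the pattern the text attributes to gene conversion.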
Over 8% of the genes of A. vaga are much more similar to 
non-metazoan sequences in GenBank than to metazoan ones (AI log score > 
45 (ref. 16), Supplementary Note E4) and were therefore probably 
acquired through horizontal gene transfer (HGT). This class of genes 
has significantly fewer introns per kilobase of coding sequence 
compared with probable core metazoan genes (AI ≤ −45, Supplementary 
Table 2). More than 20% of genes with AI > 45 are found in quartets 
(groups of four homologous copies in conserved syntenic regions) and 
were therefore probably incorporated into the rotifer genome before 
the establishment of tetraploidy, which itself pre-dates the divergence 
Figure 4 | Gene conversion and its evolutionary consequences in ameiotic 
organisms. a, Evidence for gene conversion between allelic regions. If we 
suppose that mutations happen at random in a Poisson process of parameter 
1/M (where M is the average distance between mutations), then the distance 
between two consecutive mismatches follows a negative exponential 
distribution where the proportion of identity tracks of length x equals $e^{-x/M}/M$. 
Comparison of the observed distribution of identity track lengths with this 
theoretical distribution reveals a deficit of short tracks and an excess of long 
tracks, as expected in case of gene conversion. The same pattern was observed 
when gene-coding regions were excluded from the analysis (data not shown), 
thereby ruling out a confounding effect of selection. b, In sexual organisms, 
meiotic recombination can generate offspring with fewer or more deleterious 
mutations (hence increasing or decreasing fitness) than the previous 
generation. The same outcome is expected in ameiotic organisms that 
experience gene conversion: a deleterious allele may be overwritten by a 
beneficial or neutral one, resulting in an increase in fitness, or may overwrite it, 
resulting in decreased fitness. 
of extant bdelloid families (ref. 8). The higher the number of copies of a 
putative HGT gene, the higher its number of introns and the closer 
its guanine-cytosine (GC) content to the A. vaga genome average 
(Supplementary Fig. 22), which suggests that these parameters reflect 
the age of acquisition. We also noticed signatures of possibly very 
recent HGTs: 60 genes with AI > 45 are present in only one copy (with 
normal coverage), have no intron and have a GC content that is more 
than 1% above or below the genome average (the same scaffolds also 
bear genes of probable metazoan origin with AI < 0). In summary, 
there seems to be an ancient but still ongoing process of HGT at a level 
comparable to some bacteria (ref. 17). 

Some theories predict that transposable elements should be either 
absent from the genomes of asexuals (ref. 18) or undergo unrestrained 
expansion after the switch to asexuality, potentially leading to species 
extinction unless transposable element proliferation is prevented (ref. 19). We found 
that transposable elements cover about 3% of the A. vaga genome, 
which is less than the percentage reported in most other metazoans 
(including the genome of the obligate parthenogenetic nematode 
Meloidogyne incognita, 36% of which is made up of repetitive 
elements; ref. 20). Another surprising feature is the high diversity of 
transposable-element families and the extremely low copy numbers observed for 
each of them (Supplementary Table 3). Out of 255 families, the over- 
whelming majority (209) are represented by only one or two full-length 
copies (for 24 families, no full-length copies could be identified), and for 
each full-length copy there are, on average, only about ten times as 
many transposable-element fragments. This relatively low abundance 
of decayed copies and the fact that long-terminal-repeat (LTR) retro- 
transposons have identical or nearly identical LTRs (Supplementary 
Table 4) suggest that most low-copy-number families represent recent 
arrivals. This is consistent with an ongoing process of acquisition of 
transposable elements by HGT. 

This hypothesis is further supported by the significantly higher 
density of transposable elements observed around HGTs and vice- 
versa (Supplementary Note E5). If A. vaga has been acquiring trans- 
posable elements by HGT, a question that arises is what keeps their 
number lower than in most other metazoans. Many fragmented copies 
have apparently been formed through microhomology-mediated dele- 
tions. Excision of LTR retrotransposons has also been occurring 
through LTR-LTR recombination, leaving behind numerous solo 
LTRs: for example, two Juno1 insertions, Juno1.1 and Juno1.2, which 
were present as full-length copies in the 2006 A. vaga fosmid library (ref. 21), 
exist in the current assembly only as solo LTRs (in the same genomic 
environments and with the same target site duplications). Finally, there 
is evidence for expansion and diversification of the RNA-mediated 
silencing machinery. In addition to Dicer1 proteins, which are shared 
by all metazoans, A. vaga possesses a deep-branching Dicer-like clade 
with uncertain taxonomic placement (Supplementary Fig. 20). The 
Argonaute/Piwi and RNA-directed RNA polymerase (RdRP) families 
are also expanded (Supplementary Figs 18 and 19). It is plausible that 
these proteins participate in epigenetic silencing of transposable elements 
(as was recently observed for single-copy transgenes in Caenorhabditis 
elegans; ref. 22), thereby preventing horizontally transferred transposable 
elements from multiplying upon arrival. 

Overall, the genome of A. vaga comprises more genes than usually 
reported for metazoans (Supplementary Note F2), as its haplotypes 
were assembled separately. Even taking this into account, the gene 
repertoire of A. vaga features expansion of several gene families. For 
example, the genome of A. vaga comprises 284 homeobox superclass 
genes, mostly found in four copies (quartets) but not organized in 
clusters; very few ohnologues have been lost, resulting in more homeo- 
box genes than in any other metazoan genome sequenced (Sup- 
plementary Note F5). Genes putatively related to oxido-reduction 
processes are substantially more abundant in A. vaga than in other 
metazoan species, and most of the corresponding genes appear to be 
constitutively expressed (Supplementary Table 9). This is consistent 
with the recent report of an effective antioxidant protection system 
Figure 5 | Meiotic versus ameiotic genome structures. Genes are represented 
with letters, and dashed lines connect allelic gene pairs. A meiotic genome (left) 

alternates between a haploid phase (in which a single allele of each gene is present) 
and a diploid phase (in which the genes are present in two allelic versions arranged 


in bdelloid rotifers (ref. 23). Carbohydrate-active enzymes (CAZymes) in 
the genome of A. vaga are also notably diverse and abundant, with 
1,075 genes falling into 202 characterized families. With 623 glycoside 
hydrolases (involved in the hydrolysis of sugar bonds) and 412 glyco- 
syltransferases (responsible for building sugar bonds), the CAZyme 
richness of A. vaga ranks highest among metazoans and is only 
comparable to some plants such as poplars (ref. 24). A. vaga has the richest 
repertoire of glycoside hydrolases of any organism sequenced so far, 
hinting at a diversity of feeding habits; 52% of the CAZymes have an AI > 45 
and were therefore probably acquired through horizontal gene transfer. 

A. vaga has lost 1,250 genes compared with the inferred last com- 
mon ancestor of Protostomia, the genome of which comprised at least 
7,844 unique protein-coding genes (Supplementary Note E6). A total 
of 137 PFAM domains typically present in metazoans could not be 
detected in the assembled genome sequence (Supplementary Data 10). 
Of particular interest are missing domains involved in reproductive 
processes (Supplementary Note F1); for example, the Zona pellucida-like 
domain (notably found in sperm-binding proteins; ref. 25) is present in 
an average of 36 copies in metazoan genomes but is absent in A. vaga. 
In contrast, we found multiple copies of most metazoan genes involved 
in DNA repair and homologous recombination, including a considerably 
divergent Spo11 but no Rad52 and Msh3. 

To conclude, our analysis of a lineage of the bdelloid rotifer Adineta 
vaga reveals positive evidence for asexual evolution: its genome struc- 
ture does not allow pairing of homologous chromosomes and there- 
fore seems incompatible with conventional meiosis (Fig. 5). However, 
we cannot rule out that other forms of recombination occur in bdelloid 
populations in ways that do not require homologous pairing, such as 
parasexuality (ref. 26). The high number of horizontally acquired genes, 
including some seemingly recent ones, suggests that HGTs may also 
be occurring from rotifer to rotifer. It is plausible that the repeated 
cycles of desiccation and rehydration experienced by A. vaga in its 
natural habitats have had a major role in shaping its genome: desic- 
cation presumably causes DNA double-strand breaks, and these 
breaks that allow integration of horizontally transferred genetic mater- 
ial also promote gene conversion when they are repaired. Hence, the 
homogenizing and diversifying roles of sex may have been replaced in 
bdelloids by gene conversion and horizontal gene transfer, in an unex- 
pected convergence of evolutionary strategy with prokaryotes. 


METHODS SUMMARY 


Genomic DNA was extracted from laboratory cultures of a clonal A. vaga lineage 
and shotgun-sequenced using 454 and Illumina platforms at respective coverage of 
colinearly on homologous chromosomes). In the ameiotic genome of A. vaga 
(right), alleles are distributed in blocks that are shuffled across chromosomes, 
resulting notably in intrachromosomal repeats (direct or inverted). As a 
consequence, chromosomes have no homologues and cannot be paired. 


25 and 440 times (using both single reads and mate reads from inserts up to 20 kb). 
The 454 reads were assembled into contigs using MIRA (ref. 27); the contigs obtained 
were corrected using single Illumina reads and linked into scaffolds using paired 
Illumina reads (ref. 28; Supplementary Table 1). We annotated protein-coding genes by 
integrating evidence from RNA sequencing, ab initio predictions and comparison 
with UniProt. Most synteny and Ka/Ks (non-synonymous divergence/synonymous 
divergence) analyses were performed using the package MCScanX (ref. 29) and synteny 
plots were drawn using Circos (ref. 30). 


Received 21 November 2012; accepted 30 May 2013. 
Published online 21 July; corrected online 21 August 2013 (see full-text HTML 
version for details). 


1. Danchin, E. G. J., Flot, J.-F., Perfus-Barbeoch, L. & Van Doninck, K. In Evolutionary Biology – Concepts, Biodiversity, Macroevolution and Genome Evolution (ed. Pontarotti, P.) 223-242 (Springer, 2011). 
2. Hsu, W. S. Oogenesis in the Bdelloidea rotifer Philodina roseola Ehrenberg. Cellule 57, 283-296 (1956). 
3. Davis, H. A new Callidina: with the result of experiments on the desiccation of rotifers. Month. Microscopical J. 9, 201-209 (1873). 
4. Segers, H. Annotated checklist of the rotifers (Phylum Rotifera), with notes on nomenclature, taxonomy and distribution. Zootaxa 1564, 1-104 (2007). 
5. Maynard Smith, J. Contemplating life without sex. Nature 324, 300-301 (1986). 
6. Ricci, C. Anhydrobiotic capabilities of bdelloid rotifers. Hydrobiologia 387-388, 321-326 (1998). 
7. Gladyshev, E. & Meselson, M. Extreme resistance of bdelloid rotifers to ionizing radiation. Proc. Natl Acad. Sci. USA 105, 5139-5144 (2008). 
8. Hur, J. H., Van Doninck, K., Mandigo, M. L. & Meselson, M. Degenerate tetraploidy was established before bdelloid rotifer families diverged. Mol. Biol. Evol. 26, 375-383 (2009). 
9. Rozen, S. et al. Abundant gene conversion between arms of palindromes in human and ape Y chromosomes. Nature 423, 873-876 (2003). 
10. Stephens, P. J. et al. Massive genomic rearrangement acquired in a single catastrophic event during cancer development. Cell 144, 27-40 (2011). 
11. Vijg, J. & Dollé, M. E. T. Large genome rearrangements as a primary cause of aging. Mech. Ageing Dev. 123, 907-915 (2002). 
12. Birky, C. W. Jr. Heterozygosity, heteromorphy, and phylogenetic trees in asexual eukaryotes. Genetics 144, 427-437 (1996). 
13. Leffler, E. M. et al. Revisiting an old riddle: what determines genetic diversity levels within species? PLoS Biol. 10, e1001388 (2012). 
14. Welch, D. B. M. & Meselson, M. S. Rates of nucleotide substitution in sexual and anciently asexual rotifers. Proc. Natl Acad. Sci. USA 98, 6720-6724 (2001). 
15. Teshima, K. M. & Innan, H. The effect of gene conversion on the divergence between duplicated genes. Genetics 166, 1553-1560 (2004). 
16. Gladyshev, E. A., Meselson, M. & Arkhipova, I. R. Massive horizontal gene transfer in bdelloid rotifers. Science 320, 1210-1213 (2008). 
17. Syvanen, M. Evolutionary implications of horizontal gene transfer. Annu. Rev. Genet. 46, 341-358 (2012). 
18. Hickey, D. A. Selfish DNA: a sexually-transmitted nuclear parasite. Genetics 101, 519-531 (1982). 
19. Arkhipova, I. & Meselson, M. Deleterious transposable elements and the extinction of asexuals. Bioessays 27, 76-85 (2005). 
20. Abad, P. et al. Genome sequence of the metazoan plant-parasitic nematode Meloidogyne incognita. Nature Biotechnol. 26, 909-915 (2008). 


©2013 Macmillan Publishers Limited. All rights reserved 


21. Gladyshev, E. A., Meselson, M. & Arkhipova, I. R. A deep-branching clade of retrovirus-like retrotransposons in bdelloid rotifers. Gene 390, 136-145 (2007). 
22. Shirayama, M. et al. piRNAs initiate an epigenetic memory of nonself RNA in the C. elegans germline. Cell 150, 65-77 (2012). 
23. Krisko, A., Leroy, M., Radman, M. & Meselson, M. Extreme anti-oxidant protection against ionizing radiation in bdelloid rotifers. Proc. Natl Acad. Sci. USA 109, 2354-2357 (20 
24. Geisler-Lee, J. et al. Poplar carbohydrate-active enzymes. Gene identification and expression analyses. Plant Physiol. 140, 946-962 (2006). 
25. Bork, P. & Sander, C. A large domain common to sperm receptors (Zp2 and Zp3) and TGF-β type III receptor. FEBS Lett. 300, 237-240 (1992). 
26. Forche, A. et al. The parasexual cycle in Candida albicans provides an alternative pathway to meiosis for the formation of recombinant strains. PLoS Biol. 6, e110 (2008). 
27. Chevreux, B., Wetter, T. & Suhai, S. Genome sequence assembly using trace signals and additional sequence information. Proc. German Conf. Bioinf. 99, 45-56 (1999). 
28. Boetzer, M., Henkel, C. V., Jansen, H. J., Butler, D. & Pirovano, W. Scaffolding pre-assembled contigs using SSPACE. Bioinformatics 27, 578-579 (2011). 
29. Wang, Y. et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 40, e49 (2012). 
30. Krzywinski, M. et al. Circos: An information aesthetic for comparative genomics. Genome Res. 19, 1639-1645 (2009). 




Supplementary Information is available in the online version of the paper. 


Acknowledgements The authors would like to thank M. Meselson for his support 
during the initiation phase of this project and for inspiring us with his seminal works on 
bdelloid genetics. The authors are also grateful to M. Radman for useful discussions, 
M. Knapen and N. Debortoli for participating in laboratory work, M. Lliros for helping 
with Fig. 1, S. Henrissat for participating in CAZyme analyses, and S. Oztas, B. Vacherie, 
P. Lenoble and S. Mangenot for performing PCR validations of the assembly. This work 
was supported by Genoscope-CES (where most of the sequencing was performed), by 
US National Science Foundation grants MCB-0821956 and MCB-1121334 to I.A., by 
German Research Foundation grant HA 5163/2-1 to O.H., by grant 11.G34.31.0008 
from the Ministry of Education and Science of the Russian Federation to A.S.K., by grant 
NSF CAREER number 0644282 to M.K., by US National Science Foundation grant 
MCB-0923676 to D.B.M.W., by FRFC grant 2.4.655.09.F from the Belgian Fonds 
National de la Recherche Scientifique (FNRS) and a start-up grant from the University 
of Namur to K.V.D.; J.F.F. and K.V.D. thank also J.-P. Descy (University of Namur) for 
funding support. 


Author Contributions Bo.H., X.L. and B.N. are joint second authors; O.J. and K.V.D. 
are joint last authors. Bo.H., X.L., F.R. and B.H.L. maintained the rotifer cultures; Bo.H., 
X.L., F.R. and B.H.L. prepared the genomic DNA; X.L., D.B.M.W. and B.H.L. carried out 
gene expression experiments; Bo.H., X.L. and B.H.L. prepared complementary DNAs; 
K.L., J.P. and B.H.L. carried out the sequencing; J.F.F., A.C., V.B., O.J., B.N., J.M.A. and 
C.D.S. assembled the genome, validated the assembly and built the gene set; J.F.F., 
J.M.A., V.B., G.A.B., M.D.R., E.G.J.D., O.A.V., M.K., P.W., O.J. and K.V.D. analysed the 
genome structure; Bo.H., E.G.J.D., M.D.R., J.F.F., A.H., Be.H., B.H.L., R.K., B.L., J.F.R., F.R., 
A.S.K., E.W., D.B.M.W. and K.V.D. analysed the gene families; I.A., J.B., O.P. and I.Y. 
annotated and analysed the transposable elements; O.C., P.G., B.W., R.B., P.P. and K.V.D. 
carried out orthology analysis; I.A., E.G., E.G.J.D., P.G., B.W., F.R., D.B.M.W., P.P., J.F.F. 
and O.J. analysed the horizontal gene transfers; O.A.V., J.F.F., G.A.B., A.S.K. and 
D.B.M.W. analysed the signatures of gene conversion; O.H. modelled the effect of 
gene conversion on Muller’s ratchet; J.F.F., O.J. and K.V.D. wrote the core of the 
manuscript, with contributions from I.A., E.G.J.D., A.H., B.N., O.H., Be.H., Bo.H., R.K., 
J.M.A., J.F.R., O.A.V., M.K., A.S.K., D.B.M.W., P.P. and P.W.; and P.W., J.W., 
R.B., D.B.M.W., P.P., O.J. and K.V.D. designed the project and 
acquired funding. 


Author Information The sequencing reads and assembly are available at the Sequence 
Read Archive (accessions ERP002115 and SRP020364 for DNA, ERP002474 and 
SRP020358 for cDNA) and at the European Nucleotide Archive (accession 
CAWI000000000), respectively. The assembly and annotation can be browsed and 
downloaded at http://www.genoscope.cns.fr/adineta, whereas the result of the 
orthology analysis is accessible at http://ioda.univ-provence.fr/. Reprints and 
permissions information is available at www.nature.com/reprints. The authors declare 
no competing financial interests. Readers are welcome to comment on the online 
version of the paper. Correspondence and requests for materials should be addressed 
to O.J. (ojaillon@genoscope.cns.fr or ojaillon@mit.edu), J.F.F. 
(jean-francois.flot@ds.mpg.de) or K.V.D. (karine.vandoninck@fundp.ac.be). 


This work is licensed under a Creative Commons Attribution-NonCommercial-Share 
Alike 3.0 Unported licence. To view a copy of this 
licence, visit http://creativecommons.org/licenses/by-nc-sa/3.0 


22 AUGUST 2013 | VOL 500 | NATURE | 457 


©2013 Macmillan Publishers Limited. All rights reserved 




doi:10.1038/nature12330 


Oxytocin enhances hippocampal spike transmission 
by modulating fast-spiking interneurons 


Scott F. Owen¹, Sebnem N. Tuncdemir², Patrick L. Bader¹,², Natasha N. Tirko², Gord Fishell² & Richard W. Tsien¹,² 


Neuromodulatory control by oxytocin is essential to a wide range of 
social¹,², parental³ and stress-related behaviours⁴. Autism spectrum 
disorders (ASD) are associated with deficiencies in oxytocin levels⁵ 
and with genetic alterations of the oxytocin receptor (OXTR)⁶. Thirty 
years ago, Mühlethaler et al.⁷ found that oxytocin increases the 
firing of inhibitory hippocampal neurons, but it remains unclear 
how elevated inhibition could account for the ability of oxytocin to 
improve information processing in the brain. Here we describe in 
mammalian hippocampus a simple yet powerful mechanism by 
which oxytocin enhances cortical information transfer while simul- 
taneously lowering background activity, thus greatly improving the 
signal-to-noise ratio. Increased fast-spiking interneuron activity not 
only suppresses spontaneous pyramidal cell firing, but also enhances 
the fidelity of spike transmission and sharpens spike timing. Use- 
dependent depression at the fast-spiking interneuron-pyramidal 
cell synapse is both necessary and sufficient for the enhanced spike 
throughput. We show the generality of this novel circuit mechanism 
by activation of fast-spiking interneurons with cholecystokinin or 
channelrhodopsin-2. This provides insight into how a diffusely 
delivered neuromodulator can improve the performance of neural 
circuitry that requires synapse specificity and millisecond precision. 

The CA1 region of hippocampus receives potent excitatory input from 
neighbouring area CA3 through the Schaffer Collateral (SC) pathway. 
Activation of SC axons evokes a monosynaptic excitatory postsynaptic 
potential (EPSP) onto CA1 pyramidal cells, as well as exciting a variety 
of CA1 interneurons. These interneurons then deliver a millisecond- 
delayed inhibitory postsynaptic potential (IPSP), termed feed-forward 


inhibition. Thus, both the stimulation threshold and the timing of 
spikes evoked in CA1 pyramidal cells by SC activation are dictated 
by a finely tuned balance of monosynaptic excitatory and disynaptic 
inhibitory inputs*”. 

In agreement with previous results®, we found that stimulation of 
the SC pathway in acute rat hippocampal slices evoked spikes with a 
short latency and moderate jitter (Fig. 1a). Strikingly, bath application 
of TGOT (Thr⁴,Gly⁷-oxytocin, 200 nM), a specific agonist for oxytocin 
receptors, dramatically increased the probability of evoking a spike in 
the postsynaptic neuron from 0.50 to 0.82, while simultaneously suppressing 
the spontaneous activity of CA1 pyramidal cells by 57%, from 
1.4 Hz to 0.6 Hz (Fig. 1a-d). The combination of increased evoked spike 
probability (signal) and reduced spontaneous activity (noise) resulted 
in an enhanced signal-to-noise ratio. TGOT also reduced the latency 
and increased the temporal precision of evoked spikes (Fig. 1e, f). 
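
The arithmetic behind these figures can be checked with a short Python sketch. The percentage reduction follows directly from the reported rates; the `snr_index` function is a hypothetical illustration (evoked spike probability per Hz of spontaneous firing) and is not a metric defined in this paper.

```python
# Values reported in the text (Fig. 1a-d).
p_evoked_control = 0.50   # evoked spike probability before TGOT
p_evoked_tgot = 0.82      # evoked spike probability in TGOT
spont_control_hz = 1.4    # spontaneous firing rate before TGOT (Hz)
spont_tgot_hz = 0.6       # spontaneous firing rate in TGOT (Hz)


def percent_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return 100.0 * (before - after) / before


def snr_index(p_evoked, spont_hz):
    """Hypothetical index: evoked probability per Hz of spontaneous firing."""
    return p_evoked / spont_hz


reduction = percent_reduction(spont_control_hz, spont_tgot_hz)
fold_change = (snr_index(p_evoked_tgot, spont_tgot_hz)
               / snr_index(p_evoked_control, spont_control_hz))

print(f"Spontaneous firing reduced by {reduction:.0f}%")
print(f"Illustrative signal-to-noise index changes {fold_change:.1f}-fold")
```

Any index that divides a measure of signal by a measure of noise would move in the same direction here, because the numerator rises while the denominator falls.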

TGOT increased the rate and amplitude of spontaneous inhibitory 
postsynaptic currents (IPSCs) onto CA1 pyramidal cells, as previously 
described”"” (Fig. 1g, Supplementary Fig. 1). Blockade by 10 µM bicuculline 
or by 100 nM tetrodotoxin indicated that these events were mediated 
by GABA(A) receptors and probably required an increase in interneuron 
firing rather than a change in spontaneous presynaptic release. The 
specific oxytocin receptor antagonist OTA ((d(CH₂)₅¹,Tyr(Me)²,Thr⁴, 
Orn⁸,des-Gly-NH₂⁹)-vasotocin, 1 µM) blocked the TGOT-induced 
effects, indicating that these actions were solely mediated by the oxytocin 
receptor’’. The TGOT-induced increase in spontaneous IPSCs was also 
abolished by the potent P/Q-type calcium channel blocker ω-Agatoxin 
IVA, but unaffected by the N-type calcium channel antagonist ω-Conotoxin 

Figure 1 | Oxytocin receptor agonist (TGOT) reduces spontaneous firing 
but enhances EPSP-spike coupling in CA1 pyramidal neurons. a, Exemplar 
CA1 pyramidal cell-attached recording of spikes (red). Vertical bar indicates 
Schaffer Collateral stimulus. b, Time course and c, d, group data of evoked spike 
probability (n = 15 cells) and spontaneous activity (n = 23 cells) in pyramidal 
cell-attached recordings as influenced by 200 nM TGOT. e, Exemplar recording 
demonstrating TGOT reduction in evoked spike latency and jitter. 
f, Cumulative distribution of raw (solid) and mean-subtracted (dashed) spike 
times (n = 15 cells). Mean latency: 5.87 ± 0.42 ms; mean variance: 
2.29 ± 0.30 ms². g, Spontaneous IPSC frequency onto CA1 pyramidal cells 
(n = 6 cells, each condition). Paired two-tailed t-test in panels c, d and g. Two-sample 
Kolmogorov-Smirnov (K-S) test in panel f. *P < 0.05; **P < 0.01; 
***P < 0.001. Error bars, s.e.m. 


¹Department of Molecular and Cellular Physiology, 279 Campus Drive, Stanford University School of Medicine, Stanford, California 94305, USA. ²NYU Neuroscience Institute, New York University, 450 East 29th Street, New York, New York 10016, USA. 




GVIA, indicating that these events probably arise primarily from fast- 
spiking interneurons (FSIs) with little contribution from regular-spiking 
(RS) interneurons (Fig. 1g, Supplementary Fig. 1)'*”’. 

To test more directly whether TGOT precisely targeted FSI subtypes, 
we used whole cell recordings in CA1 strata oriens and pyramidale, as 
stratum radiatum interneurons are unresponsive to TGOT" and lack 
OXTR expression’’. We found a clear distinction: FSIs were responsive 
to TGOT, whereas RS interneurons were not (Fig. 2a). FSIs displayed 
robust responses upon application of 20 and 200 nM TGOT 
(Supplementary Fig. 2a-c), the latter producing a near-saturated effect. 
Dividing the increase in IPSCs onto pyramidal cells (27.3 Hz, Fig. 1g) 
by the increase in FSI firing rate (8.8 Hz per FSI, Fig. 2a), we calculate 
that, on average, each pyramidal cell receives input from at least ~3.1 
TGOT-responsive FSIs in our slices. 
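
The convergence estimate is a simple ratio, sketched below in Python. Reading the result as a lower bound rests on an assumption not spelled out above: that each FSI spike produces at most one detected IPSC in a connected pyramidal cell, so transmission failures or undetected events would only raise the true number of contributing FSIs.

```python
# Reported increases in TGOT (Fig. 1g and Fig. 2a).
ipsc_rate_increase_hz = 27.3   # extra spontaneous IPSCs per pyramidal cell
fsi_rate_increase_hz = 8.8     # extra spikes per TGOT-responsive FSI

# Assumption: at most one detected IPSC per presynaptic spike, so the
# ratio is a minimum number of FSIs converging on each pyramidal cell.
min_converging_fsis = ipsc_rate_increase_hz / fsi_rate_increase_hz

print(f"At least ~{min_converging_fsis:.1f} TGOT-responsive FSIs per pyramidal cell")
```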

To clarify mechanisms by which TGOT depolarizes FSIs, we voltage 
clamped FS perisomatic-targeting (basket and axo-axonic) and RS basket 
cells at −65 mV. TGOT induced a large inward current in FSIs (Fig. 2b, 
Supplementary Fig. 2g), but as expected had no effect on the RS cells 
(data not shown). TGOT also increased the rate of spontaneous inhibi- 
tory postsynaptic currents (IPSCs) onto FSIs (Supplementary Fig. 2d-f), 
as predicted from the FSI-FSI connectivity that may serve to regulate 
the distribution and extent of inhibition. 

To test whether the TGOT-induced inward current arises from G 
protein signalling within the FSI itself, we replaced the GTP in the 
intracellular recording solution with 1 mM GTPγS, a non-hydrolysable 
GTP analogue that renders G proteins constitutively active. Action of 
GTPγS in inducing inward current largely occluded the effect of TGOT 
(Fig. 2b, Supplementary Fig. 2g), verifying that the TGOT effects involve 
G protein signalling within the recorded neuron. The amplitude and 
kinetics of the TGOT-induced current were unaffected by intracellular 
BAPTA, indicating that the intracellular signalling mechanism is probably 
not Ca²⁺-dependent". In voltage ramp recordings from FSIs, the 
TGOT-induced current reversed at −3.1 ± 3.4 mV (Fig. 2c, Supplementary 
Fig. 2h), suggesting that the currents are generated by a non-selective 
cation channel. Partial replacement of external sodium by NMDG (50 mM) 
shifted the reversal potential to more negative values (−13.8 ± 3.7 mV, 
P < 0.05, unpaired two-tailed t-test, data not shown), pointing to Na⁺ 
as the predominant charge carrier of the TGOT-induced inward current. 
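
Why sodium replacement should move the reversal potential in the negative direction can be illustrated with the Goldman-Hodgkin-Katz voltage equation. The Python sketch below is not the authors' analysis and rests on several assumptions: the channel is taken to be equally permeable to Na⁺ and K⁺, NMDG is treated as impermeant, divalent cations are ignored, and the intracellular concentrations (`NA_IN`, `K_IN`) are hypothetical round numbers. The external sodium is summed from the rat ACSF recipe given in Methods, and the temperature is the midpoint of the stated recording range.

```python
import math

GAS_CONSTANT = 8.314   # J mol^-1 K^-1
FARADAY = 96485.0      # C mol^-1


def ghk_reversal_mv(na_out, k_out, na_in, k_in, p_na=1.0, p_k=1.0, temp_c=33.0):
    """GHK reversal potential (mV) for a channel permeable to Na+ and K+."""
    rt_over_f_mv = 1000.0 * GAS_CONSTANT * (temp_c + 273.15) / FARADAY
    outside = p_na * na_out + p_k * k_out
    inside = p_na * na_in + p_k * k_in
    return rt_over_f_mv * math.log(outside / inside)


# External Na+ (mM) from the rat ACSF in Methods:
# 122 NaCl + 26 NaHCO3 + 1.25 NaH2PO4 + 3 sodium pyruvate + 2 sodium ascorbate.
NA_OUT = 122 + 26 + 1.25 + 3 + 2
K_OUT = 3.0
NA_IN, K_IN = 10.0, 140.0   # ASSUMED intracellular values, not from the paper

e_control = ghk_reversal_mv(NA_OUT, K_OUT, NA_IN, K_IN)
e_nmdg = ghk_reversal_mv(NA_OUT - 50.0, K_OUT, NA_IN, K_IN)  # 50 mM Na+ replaced
shift_mv = e_nmdg - e_control

print(f"Control: {e_control:.1f} mV, NMDG: {e_nmdg:.1f} mV, shift: {shift_mv:.1f} mV")
```

Under these assumptions the predicted reversal potential sits near 0 mV in control solution and shifts negative by roughly 10 mV when 50 mM sodium is removed, the same direction and order of magnitude as the measured change.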

To investigate the mechanisms of the enhanced fidelity of spike trans- 
mission, we obtained whole cell current clamp recordings from CA1 
pyramidal cells and elicited spikes synaptically or by current injection 
on interleaved trials (Fig. 2d, e, Supplementary Figures 3 and 4a). 
TGOT increased the fidelity of synaptically evoked spikes in whole cell 
mode, paralleling its effect in cell-attached recordings (Fig. 1), but 
reduced the probability of evoking spikes by whole cell current injec- 
tion. This apparent reduction in pyramidal cell excitability was coupled 
to a hyperpolarization of the cell membrane (Supplementary Fig. 4b). 
As TGOT had no effect on the holding current or membrane resistance 
in voltage clamp recordings of pyramidal neurons in the presence of 
bicuculline (Supplementary Fig. 4c), we concluded that the reduction 
in spontaneous activity and excitability was wholly attributable to 
enhanced inhibitory tone. This increase in inhibitory tone, however, 
made the enhanced EPSP-spike coupling all the more surprising. 

We speculated that the enhanced EPSP-spike coupling might arise 
from a shift in the synaptic excitatory—inhibitory balance. Indeed, the 
disynaptic IPSP was reduced by TGOT (Supplementary Fig. 4g, h), and 
bicuculline abolished the TGOT-induced increase in evoked spike 
probability in cell-attached recordings (Fig. 2f). To most rigorously 
isolate inhibitory inputs, we stimulated the Schaffer Collateral pathway 
while holding the cell at 0 mV under voltage clamp, and found that the 
evoked disynaptic IPSC was reduced by TGOT (Fig. 2g, h). In contrast, 
the evoked excitatory postsynaptic current (EPSC), isolated at −65 mV 
in the presence of bicuculline, was unaffected. This selective reduction 
of the evoked IPSC, which spares the EPSC, shifts the excitatory-inhibitory 
(E-I) balance and could account for the increase in evoked 
spike probability. This reduction in feed-forward inhibition could arise 
from either a reduction in excitatory to inhibitory (E→I) transmission, 
causing fewer interneurons to be activated, or a reduction in inhibitory 


Figure 2 | TGOT activates FSIs and suppresses 
feed-forward inhibition. a, TGOT influence on 
membrane potential (filled symbols) and firing rate 
(open symbols). Exemplar electrophysiological 
identification of interneurons, inset. Above, 
exemplar biocytin-filled interneuron tracings 
(soma and dendrites, black; axon, green; stratum 
pyramidale, grey area). Oriens-lacunosum 
moleculare (O-L.M.; n = 3), RS perisomatic- 
(n = 6 cells). f, Cell-attached, synaptically evoked 
spike probability in bicuculline (n = 6 cells, 
probability: P > 0.95; latency: P > 0.5; latency 
variance: P > 0.3). Reduced stimulus strength was 
sufficient to reach 50% spike transmission and 
spikes occurred at longer latencies and with more 
jitter than in control ACSF*. g, TGOT influence on 
average evoked disynaptic IPSC from one 
pyramidal cell and monosynaptic EPSC from a 
different pyramidal cell. h, Normalized group data 
for evoked EPSC (n = 6 cells) and disynaptic IPSC 
(n = 8 cells). Paired two-tailed t-test, all panels 
except panel d, which uses one-tailed t-test for 
compatibility with cell-attached results. *P < 0.05, 
**P < 0.01, ***P < 0.001. Error bars, s.e.m. 
to excitatory (I→E) transmission, causing each interneuron to be less 
effective. We recorded from FSIs while stimulating the Schaffer Collateral 
pathway but found no effect of TGOT on E→I transmission (Supplementary 
Fig. 4i, j). In contrast, stimulating the stratum pyramidale 
while blocking excitatory transmission with the AMPA and NMDA 
receptor antagonists NBQX and AP5 revealed a TGOT-induced suppression 
of I→E transmission onto pyramidal cells (Supplementary Fig. 4k, l). 

Using paired whole cell recordings, we investigated how TGOT 
reduces the evoked IPSC at the I→E synapse. TGOT increased the spontaneous 
firing of presynaptic FSIs and also diminished the FSI-pyramidal 
cell unitary IPSC, without affecting RS interneurons (Fig. 3a, b). When 
the TGOT-induced depolarization of the presynaptic FSI was countered 
with a hyperpolarizing bias current, however, the spontaneous firing in 
the presynaptic cell ceased and the unitary evoked IPSC was maintained 
at its pre-TGOT amplitude. This rescue suggests that TGOT induces a 
use-dependent depression of the IPSC’*"®, and that the increase in 
spontaneous FSI firing is necessary for the reduction in the evoked 
feed-forward IPSC. 

To test whether the TGOT-induced increase in the FSI firing rate 
was sufficient to account for the observed synaptic depression, we drove 
trains of action potentials 10 s in duration in the absence of TGOT 
(Fig. 3c). The frequency dependence of the residual IPSC following a 
10 s train in control artificial cerebrospinal fluid (ACSF) (Fig. 3d, 
coloured diamonds) matched closely with that of the residual IPSC in 
TGOT (Fig. 3d, black symbols). Thus, the TGOT-mediated increase in 
FSI spontaneous firing is not only necessary (Fig. 3a, b), but also suffi- 
cient (Fig. 3c, d) to account for the observed decrease in evoked IPSC 
amplitude (Fig. 2g, h), and enhancement of EPSP-spike coupling 
(Fig. 1). Recovery of the IPSC was nearly complete by 4.5 s following 
the 50 Hz train, consistent with a rapid switching of the FS synapses 
between baseline and depressed states’® (Fig. 3e). We also observed a 
modest, frequency-dependent increase in the spike width over the 10 s 
trains (Supplementary Fig. 5) that would be expected, if anything, to 
increase presynaptic release, contrary to the depression that was observed. 
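
The general idea of use-dependent depression and its recovery can be sketched with a textbook resource-depletion model. This is not the authors' model and it is not fitted to their data: the release fraction `use_fraction` and recovery time constant `tau_rec_s` are hypothetical, and a pure depletion model of this kind does not reproduce the incomplete depression at high frequencies reported above. The sketch only shows how steady-state amplitude falls with firing rate and how quickly it recovers once firing stops.

```python
import math


def steady_state_fraction(freq_hz, use_fraction, tau_rec_s):
    """Closed-form steady-state IPSC amplitude, relative to the first IPSC."""
    decay = math.exp(-1.0 / (freq_hz * tau_rec_s))
    return (1.0 - decay) / (1.0 - (1.0 - use_fraction) * decay)


def simulate_train(freq_hz, duration_s, use_fraction, tau_rec_s):
    """Relative IPSC amplitude for each spike in a regular train."""
    interval_s = 1.0 / freq_hz
    n_spikes = int(round(freq_hz * duration_s))
    available = 1.0                      # fraction of resources available
    amplitudes = []
    for _ in range(n_spikes):
        amplitudes.append(available)     # amplitude scales with availability
        available *= 1.0 - use_fraction  # depletion by this spike
        # exponential recovery toward 1 during the inter-spike interval
        available = 1.0 - (1.0 - available) * math.exp(-interval_s / tau_rec_s)
    return amplitudes


def recovered_fraction(start, elapsed_s, tau_rec_s):
    """Availability after `elapsed_s` of silence, starting from `start`."""
    return 1.0 - (1.0 - start) * math.exp(-elapsed_s / tau_rec_s)


USE_FRACTION = 0.1   # hypothetical
TAU_REC_S = 1.0      # hypothetical

for freq in (1, 5, 20, 50):
    last = simulate_train(freq, 10.0, USE_FRACTION, TAU_REC_S)[-1]
    print(f"{freq:>2} Hz: residual IPSC after 10 s = {last:.2f}")
```

With a recovery time constant on the order of one second, availability returns close to baseline within a few seconds of silence, which is the qualitative behaviour the recovery experiment probes.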

The specificity of TGOT for FSIs suggested that this mechanism may 
be a general property of this network (Supplementary Fig. 6a), and that 
any peptide, network state, or signal that increases the spontaneous activity 
of FSIs will also increase the fidelity of spike transmission. We tested 
this hypothesis using two independent approaches, first stimulating 
FSIs with the peptide cholecystokinin (CCK), and second, targeting 
this population with the light-activated ion channel channelrhodopsin-2 
(ChR2). 

CCK activates FS basket cells"’, transiently increasing their firing 
rate in a manner reminiscent of TGOT. In close agreement with our 
TGOT results, CCK enhanced inhibitory tone and suppressed the evoked 
feed-forward IPSC without affecting the evoked EPSC (Fig. 4a, b, 
Supplementary Fig. 6b-e). In cell-attached recordings, CCK increased 
the probability of evoking spikes in CA1 pyramidal cells by Schaffer 
Collateral stimulation, while simultaneously suppressing the spontan- 
eous firing of these cells (Fig. 4c, d). Furthermore, both the latency and 
the jitter of the evoked spikes were reduced by CCK (Supplementary 
Fig. 6f), just as they were with TGOT (Fig. 1f). 

We then used ChR2 to selectively activate FSIs in acute hippocampal 
slices from PV-Cre BAC transgenic mice. Immunostaining confirmed 
that the ChR2 was efficiently targeted to the parvalbumin-expressing 
(PV⁺) FSIs (Supplementary Fig. 7). Optogenetic activation of FSIs 
induced IPSCs in CA1 pyramidal cells that showed a strong synaptic 
depression (Fig. 4e), consistent with our paired recording data (Fig. 3), 
and with previous reports’®'®. In agreement with our TGOT and CCK 
results, driving FSIs with a brief train of blue light pulses preceding an 
electrical stimulus to the Schaffer Collateral pathway modestly increased 
the probability of eliciting a spike in a postsynaptic pyramidal cell, 
relative to interleaved control trials in which the blue light was omitted 
(Fig. 4f, Supplementary Fig. 8a). Examination of the spike latency and 
jitter, however, revealed a high probability that monosynaptic inhibi- 
tion contaminated a subset of these recordings. When the recordings 
with the shortest latency and lowest jitter were excluded (Supplemen- 
tary Fig. 8d-f, see Methods), the remaining neurons all exhibited a 
pronounced increase in evoked spike probability following ChR2 sti- 
mulation (Fig. 4f, g). Taken together, three interventions, TGOT, CCK 
and ChR2, therefore all converge on a single surprising conclusion: that 
activation of FSIs enhances the fidelity of spike transmission in the 
hippocampus. TGOT also increased the evoked population spike ampli- 
tude in the presence of kainate-induced gamma rhythms (Supplemen- 
tary Fig. 9), thus confirming that the TGOT-induced enhancement in 
EPSP-spike coupling is robust under more in vivo-like conditions. 

We constructed a minimal computational model to investigate the 
mechanisms linking FSI activation to the evoked spike probability, 

Figure 3 | Paired recordings reveal synaptic locus of TGOT-induced 
decrease in evoked inhibition. a, Presynaptic interneurons (upper) and 
postsynaptic pyramidal cells (lower). Individual sweeps in grey, average in 
black. Presynaptic FSI permitted to depolarize in TGOT (top). TGOT 
depolarization of presynaptic FSI countered by current injection (middle). RS 
interneuron transmission unaffected by TGOT (bottom). b, FSI-pyramidal 
synapses depress only when FSI is depolarized by TGOT. FSI, n = 8; no 
depolarization, n = 5; RS, n = 4. c, Frequency-dependent depression of 
FSI-pyramidal cell synapses in control ACSF. Averages normalized to first 
IPSC (2 Hz, 10 Hz, n = 7 cell pairs; 1 Hz, 5 Hz, 20 Hz, 50 Hz, n = 8 cell pairs). 
d, FSI-pyramidal synaptic depression from TGOT-induced firing (black 
circles, one point per cell pair) matches depression by 10 s spike trains in 
control ACSF (coloured diamonds, average from multiple cell pairs). No black 
circles obscured by coloured diamonds. e, Synaptic recovery following 50 Hz, 
10 s train (n = 8 cells). Paired two-tailed t-test. *P < 0.05. NS, not significant, 
P > 0.15. Error bars, s.e.m. 

Figure 4 | Generalization to other brain states and modulators. a, CCK 
(200 nM) influence on average evoked disynaptic IPSC from one pyramidal cell 
and monosynaptic EPSC from a different pyramidal cell. b, Normalized group 
data for evoked EPSC (n = 6 cells) and disynaptic IPSC (n = 6 cells). c, Evoked 
spike probability (n = 14 cells) and d, spontaneous firing rate (n = 14 cells) in 
cell-attached recordings as influenced by CCK. e, Exemplar ChR2-evoked 
IPSCs recorded in a CA1 pyramidal neuron from a PV-Cre mouse injected with 
double-floxed AAV-ChR2. f, Cell-attached recording from mouse CA1 
pyramidal neuron. Control (LED off) and ChR2 stimulation (LED on) sweeps 
interleaved during recording, but grouped for presentation. g, ChR2 influence 
on cell-attached spike probability in the subset of mouse pyramidal neurons in 
which latency and jitter indicated a minimal monosynaptic inhibition (see 
Methods). h, Computer simulated exemplar traces in which IPSC conductance 
(gIPSC) is reduced and EPSC conductance (gEPSC) is either lowered to maintain 
~50% chance of spiking (top) or gEPSC is held constant (bottom). i, Residual 
IPSC influence on simulated spike latency with gEPSC held constant (red) or 
reduced to maintain ~50% spike probability (grey). Panels h and i generated in 
absence of spontaneous IPSCs to isolate feed-forward IPSC contribution to 
evoked spike timing. j, Residual IPSC influence on probability of eliciting 
exactly one spike (gEPSC held constant). Paired two-tailed t-test. *P < 0.05; 
**P < 0.01; ***P < 0.001. Error bars, s.e.m. 


latency and jitter. We mimicked the enhanced FSI activity by increas- 
ing the rate and amplitude of spontaneous IPSCs. The synaptic depres- 
sion at the FSI-pyramidal synapse was simulated by reducing the evoked 
IPSC to 60% of its basal value. In agreement with our experimental 
results, these changes reduced the simulated evoked IPSP, increased 
the simulated evoked spike probability, and sharpened the evoked 
spike timing (Supplementary Fig. 10). 

We then asked why a decrease in feed-forward inhibition shrinks 
evoked spike latency and jitter, in apparent conflict with the idea that 
feed-forward inhibition enforces sharp spike timing*’. Resolution is 
achieved by considering how the EPSC and IPSC conductances (gEPSC 
and gIPSC) regulate membrane voltage near the spike-firing threshold. 
A reduction in gIPSC allows an unaltered gEPSC to push the membrane 
potential up to the spike-firing threshold more reliably, more quickly 
and more precisely (Fig. 4h, i). In contrast, if gEPSC is reduced to nearly the 
same degree as gIPSC in order to clamp the likelihood of spike firing®, 
the latency and jitter are increased. 
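
The interplay of the two conductances can be illustrated with a passive single-compartment membrane driven by a fast excitatory conductance and a delayed inhibitory one. The Python sketch below is a deliberately reduced stand-in, not the NEURON model used in this study: every parameter value (capacitance, leak, reversal potentials, time constants, onset delay, threshold) is an assumption chosen for illustration, no action potentials are generated, and whether the trace reaches the assumed threshold depends on those choices. It compares a full feed-forward IPSC with one reduced to 60% while gEPSC is held constant.

```python
import math


def alpha_conductance(t_ms, onset_ms, peak_ns, tau_ms):
    """Alpha-function conductance (nS) that peaks tau_ms after onset."""
    s = t_ms - onset_ms
    if s <= 0.0:
        return 0.0
    return peak_ns * (s / tau_ms) * math.exp(1.0 - s / tau_ms)


def simulate_epsp(g_epsc_ns, g_ipsc_ns, threshold_mv=-50.0,
                  dt_ms=0.01, t_stop_ms=50.0):
    """Forward-Euler integration of a passive membrane.

    Returns (peak voltage in mV, time of first threshold crossing in ms or None).
    All parameter values are illustrative assumptions.
    """
    c_pf, g_leak_ns = 200.0, 10.0
    e_leak, e_exc, e_inh = -65.0, 0.0, -70.0
    v = e_leak
    peak_v, latency_ms = v, None
    for i in range(int(round(t_stop_ms / dt_ms))):
        t = i * dt_ms
        g_e = alpha_conductance(t, 0.0, g_epsc_ns, 2.0)   # monosynaptic EPSC
        g_i = alpha_conductance(t, 2.0, g_ipsc_ns, 5.0)   # delayed disynaptic IPSC
        current_pa = (g_leak_ns * (v - e_leak)
                      + g_e * (v - e_exc)
                      + g_i * (v - e_inh))
        v -= dt_ms * current_pa / c_pf                    # pA / pF = mV / ms
        peak_v = max(peak_v, v)
        if latency_ms is None and v >= threshold_mv:
            latency_ms = t + dt_ms
    return peak_v, latency_ms


peak_full, lat_full = simulate_epsp(g_epsc_ns=12.0, g_ipsc_ns=10.0)
peak_reduced, lat_reduced = simulate_epsp(g_epsc_ns=12.0, g_ipsc_ns=6.0)

print(f"Full IPSC:    peak {peak_full:.1f} mV, threshold crossing {lat_full}")
print(f"Reduced IPSC: peak {peak_reduced:.1f} mV, threshold crossing {lat_reduced}")
```

Because the inhibitory reversal potential lies below the membrane potential throughout, a smaller inhibitory conductance can only raise the voltage trajectory, so the reduced-IPSC trace peaks higher and, if the full-IPSC trace reaches threshold at all, the reduced-IPSC trace reaches it no later.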

Finally, we probed the functional consequences of the strikingly 
incomplete depression of the FS synapses (Fig. 3c) and the effects of 
varying the latency between the onset of gEPSC and gIPSC. Fidelity of spike 
transmission (defined as the fraction of sweeps containing precisely one 
postsynaptic spike) is maximal when IPSCs are depressed by approximately 
40% (Fig. 4j), the value we observed experimentally in response 
to either TGOT or CCK application (Figs 2g, h and 4a, b). Likewise, a 
residual IPSC of 50-60% was optimal in considerations of global spike 
jitter (Supplementary Fig. 10h). Thus, the empirically observed TGOT 
response in FSIs seems well suited in multiple respects to the efficient 
retuning of overall circuit performance. 
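
The fidelity measure defined above, the fraction of sweeps containing precisely one postsynaptic spike, can be written as a short Python function. The spike counts in the example are hypothetical and serve only to show that both failures (zero spikes) and multiple-spike sweeps lower the score.

```python
def spike_fidelity(spike_counts):
    """Fraction of sweeps that contain exactly one postsynaptic spike."""
    if not spike_counts:
        raise ValueError("at least one sweep is required")
    single_spike_sweeps = sum(1 for n in spike_counts if n == 1)
    return single_spike_sweeps / len(spike_counts)


# Hypothetical spike counts per sweep: failures (0) and doublets (2)
# both count against fidelity.
example_sweeps = [1, 0, 1, 2, 1, 1, 0, 1]
example_fidelity = spike_fidelity(example_sweeps)
print(f"Fidelity: {example_fidelity:.3f}")
```

Penalizing multiple spikes as well as failures is what makes an intermediate level of residual inhibition optimal: too much inhibition produces failures, too little permits extra spikes.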

Our experiments reveal a generalized mechanism through which 
oxytocin improves the fidelity and temporal precision of information 
transfer through brain networks. Oxytocin enhanced circuit perform- 
ance in three ways: increasing throughput of output spikes, sharpening 
submillisecond spike timing, and suppressing background firing. Each 
of these improvements in circuit signal-to-noise ratio could be traced 
to the action of oxytocin on FSIs and reproduced through quantitative 
simulations, as well as through other interventions that specifically activ- 
ate FSIs. 

The rapid onset and recovery of FSI use-dependent synaptic depres- 
sion is well suited to shift circuit dynamics rapidly yet stably in res- 
ponse to oxytocin, whether delivered quickly and focally, as in synaptic 
release’, or presented diffusely at low doses, as in volume transmission. 
The partial depression of FSI synapses (residual, ~35%) and the spar- 
ing of RS interneurons ensures that modulation by oxytocin avoids the 
dangers associated with a complete loss of inhibition such as dramati- 
cally impaired spike timing precision® (Supplementary Figs 10 and 11) 
and epileptogenesis. 

Our experiments provide a circuit mechanism linking three dispar- 
ate aspects of ASD’’. Oxytocin signalling has been implicated in ASD 
by genetic analysis and pharmacological studies*®’*"’. PV-positive 
FSIs are important in autism aetiology”’, presumably due to their role 
in excitation-inhibition balance and neuronal oscillations, both of 
which are likely impaired in ASD. Deficiencies in signal-to-noise ratio, 
observed as unreliable cortical evoked potentials in ASD’, offer a 
valuable endophenotype, but have not yet been linked to a circuit 
defect or a therapeutic strategy. Tying these aspects together, our finding 
that FSIs are direct targets of oxytocin and can potently modulate 
circuit signal-to-noise ratio shows that these cells may be uniquely poised 
to counteract deficits in rapid information processing in psychiatric 
disorders'*”°*". In healthy individuals, oxytocin signalling through 
FSIs may provide a salience cue, capable of transiently enhancing 
cognitive performance’*””. Indeed, increasing PV⁺ interneuron activity 
was sufficient to recover hippocampal-dependent behavioural deficits 
in a mouse model of Alzheimer’s disease**. There may be parallels 
in the visual cortex as well, where optogenetic activation of PV⁺ interneurons 
operates like a salience cue and sharpens orientation tuning™. 

The selective action of oxytocin on FSIs, amidst the wide variety of 
interneuron types, raises questions about functional logic. Specific tar- 
geting of FSIs may be geared toward altering network function through 
fine-tuning of feed-forward inhibition. Importantly, the FSIs engaged 
by oxytocin are physiologically and functionally distinct from RS inter- 
neurons, which play a major role in feed-back inhibition and whose 
output is regulated by endocannabinoids”. By selectively targeting 
distinct interneuron populations, neuromodulators like oxytocin and 
endocannabinoids could be specialized for sculpting different forms of 
inhibition. 

Another modulator, noradrenaline, enhances circuit signal-to-noise 
ratio in slice and in vivo through a variety of mechanisms across 
multiple brain regions including the hippocampus**” and auditory 
system”. In auditory brainstem, Kuo and Trussell described how nor- 
adrenaline suppresses cartwheel inhibitory neuron spiking, relieving 
their output synapses from tonic depression”’. Although this mech- 
anism differs from ours in direction of change and functional outcome, 
an emergent general principle is that modulation of inhibitory neuron 
tonic firing and variation in use-dependent synaptic depression can 
regulate signal-to-noise. In the hippocampus, several monoamine res- 
ponses have been delineated across excitatory and inhibitory neurons 
that enhance circuit signal-to-noise ratio**””*°. Although oxytocin and 
noradrenaline both enhance the signal-to-noise ratio of CA1 pyramidal 
neurons, the widespread effects of noradrenaline contrast sharply 
with the exquisitely focused mechanism we uncovered. Oxytocin 
accomplishes both the enhanced fidelity of spike transmission and the 
suppression of background activity by selectively targeting a single 
locus: FSI activity. Furthermore, FSI synaptic depression in hippocam- 
pal CA1 (Fig. 3) is representative of that in dentate gyrus”®, cortex", 
and elsewhere, indicating that similar modulation of signal-to-noise 
ratio by FSI activity may be essential in many brain regions. 


METHODS SUMMARY 


A full description of materials and Methods including slicing procedure, recording 
methodology, drugs, reagents, mouse lines, viruses, interneuron labelling and classi- 
fication and computer modelling is available in the Supplementary Information. 
Briefly, acute hippocampal slices (350-µm thick) were prepared from Sprague-Dawley 
rats aged p21-p28 of either gender. Gender of animals did not significantly 
influence the effect of TGOT on EPSP-spike coupling or on interneuron depolarization 
(Supplementary Fig. 13). For optogenetics experiments (Fig. 4), acute 
hippocampal slices (300-µm thick) were prepared for recording from PV-Cre mice 
3-5 weeks following virus injection. Modelling was performed using NEURON 
(http://www.neuron.yale.edu/neuron/) and MATLAB. All protocols were approved 
by the Institutional Animal Care and Use Committee of Stanford University. 


Full Methods and any associated references are available in the online version of 
the paper. 

METHODS 


Slice preparation. Rat hippocampal slices (350-µm thick) were prepared using a 
Leica VT 1000S vibratome from p21-p28 Sprague-Dawley rats of either sex in ice-cold 
sucrose slicing solution containing (in mM) 206 sucrose, 11 D-glucose, 2.5 
KCl, 1 NaH₂PO₄, 10 MgCl₂, 2 CaCl₂ and 26 NaHCO₃. Rats were anaesthetized 
with isoflurane inhalation before decapitation and dissecting out of the hippocampus. 
Mouse transverse hippocampal slices (300-µm) were prepared using a 
Vibratome 1000 plus (Vibratome). Mice were deeply anaesthetized with intraperitoneal 
injection of pentobarbital (100 mg per kg body weight) and then transcardially 
perfused with ~30 ml ice-cold sucrose-ACSF solution containing (in mM) 
252 sucrose, 24 NaHCO₃, 1.25 NaH₂PO₄, 3 KCl, 2 MgSO₄, 10 D-glucose and 0.5 
CaCl₂. All slices from rats and mice were allowed to recover submerged in artificial 
cerebrospinal fluid (ACSF) for 1 h at 34 °C, and then maintained at room temperature 
until recording. For recordings from rat tissue, ACSF contained (in mM) 
122 NaCl, 3 KCl, 10 D-glucose, 1.25 NaH₂PO₄, 2 CaCl₂, 1.3 MgCl₂, 26 NaHCO₃, 
3 sodium pyruvate, 2 sodium ascorbate and 5 L-glutamine. For mouse recordings, 
ACSF contained (in mM) 124 NaCl, 26 NaHCO₃, 2.5 KCl, 1.25 NaH₂PO₄, 2 CaCl₂, 
2 MgSO₄, 5 L-glutamine, and 10 D-glucose. All slice preparation and recording 
solutions were oxygenated with carbogen gas (95% O₂, 5% CO₂, pH 7.4). 

Electrophysiological recordings. Recordings were performed in a submerged 
chamber at 32-34 °C with constant bath perfusion of ACSF at ~5 ml minute⁻¹ 
for rats, ~2 ml minute⁻¹ for mice. Slices were allowed 15-45 min to equilibrate 
before recording. Because the GABA(B) blocker CGP52432 (2 µM) did not affect the 
TGOT enhancement of evoked spike probability, recordings were pooled from 
control ACSF (n = 7 cells) and CGP52432 (n = 8 cells) conditions to measure 
spike probability, suppression of spontaneous firing, and evoked spike timing 
(Fig. 1). For cell-attached measurement of TGOT influence on spontaneous activity, 
results were pooled from recordings in control ACSF (n = 15 cells) and in the 
presence of CGP52432 at 2 µM (n = 8 cells). To prevent ictal activity, the CA3 
region of each slice was removed before recordings in bicuculline. Recordings were 
made using glass pipettes with a tip resistance of 2-4 MΩ. For cell-attached 
recordings, pipettes were filled with ACSF and the amplifier was set in voltage 
clamp mode. Slices were visualized with an upright microscope (Zeiss Axioskop 
2 FS plus) using infrared differential interference contrast (IR-DIC) optics. Data 
were recorded with a MultiClamp 700B amplifier (Axon Instruments), filtered at 
10 kHz using a Bessel filter and digitized at 20 kHz with a Digidata 1322A analogue-digital 
interface (Axon Instruments). For whole cell recordings, experiments 
were discarded if the series resistance changed significantly or reached 20 MΩ. 
Spontaneous IPSCs onto pyramidal cells and unitary IPSCs in paired recordings 
were detected in voltage clamp using a high Cl⁻ internal solution containing (in 
mM) 70 CsMeSO₃, 35 CsCl, 15 TEA-Cl, 1 MgCl₂, 0.2 CaCl₂, 10 HEPES, 0.3 EGTA, 
10 Tris-phosphocreatine, 4 Mg-ATP, and 0.3 Na-GTP. For evoked IPSC and EPSC 
recordings, the internal solution contained (in mM) 130 CsMeSO₃, 8 CsCl, 1 
MgCl₂, 10 HEPES, 0.3 EGTA, 10 Tris-phosphocreatine, 4 Mg-ATP, and 0.3 Na-GTP. 
Bicuculline (10 µM), TTX (100 nM) and OTA (1 µM) were delivered as 
indicated in the bathing solution throughout the recording (Fig. 1, Supplementary 
Figure 1). Calcium channel blockers ω-agatoxin IVA at 0.5 µM (AgaIVA) or 
ω-conotoxin GVIA at 1 µM (GVIA) were delivered by pretreating the slice for 
30 min in an interface chamber before recording in control ACSF. AgaIVA and 
GVIA recordings were performed in separate slices from the same experimental animal. 

Synaptic events were evoked using a tungsten bipolar stimulating electrode placed 
in the Schaffer Collateral excitatory afferents from area CA3 to deliver stimuli 100 µs
in duration. With the exception of Supplementary Fig. 4k, l, the stimulating electrode
was placed far from the recorded cell (~400 µm to ~800 µm) to minimize
monosynaptically evoked IPSCs. In Supplementary Fig. 4k, l, monosynaptic IPSCs were
evoked using submaximal stimulation by placing the stimulating electrode in the
pyramidal cell layer close to the recorded cell (~100 µm), and including 10 µM
NBQX and 50 µM AP5 in the bath to block excitatory transmission. For evoked
IPSP measurement, data were pooled from evoked spike successes and failures and
from recordings in the presence (n = 5 cells) or the absence (n = 1 cell) of the
GABAB antagonist CGP52432 (2 µM). Evoked disynaptic feed-forward IPSCs
(Figs 2g, h and 4a, b) were recorded as outward currents at a holding potential
of 0 mV in control ACSF. Evoked EPSCs were isolated by including 10 µM bicuculline
in the bath and holding the cell at −65 mV. Two out of 14 recordings in Figs 4c, d
and Supplementary Figures 6e, f were performed in the continuous presence of
AM-251 (2 µM) to confirm the persistence of the CCK-induced enhancement of
EPSP-spike coupling even when endocannabinoid signalling was blocked. 

For current clamp recordings, and all interneuron recordings except for the voltage 
ramp experiments, the intracellular solution contained (in mM) 130 K-Gluconate, 
1 MgCl2, 10 HEPES, 0.3 EGTA, 10 Tris-phosphocreatine, 4 Mg-ATP, and 0.3
Na-GTP. For interneuron recordings this solution was supplemented with 0.1%
biocytin. GTP was omitted in experiments featuring GTPγS. For voltage clamp
recordings of TGOT-induced currents in FSIs, traces were divided into 10 s segments,
with the mean value of each segment plotted as a function of time to exclude
synaptic events. See Supplementary Fig. 2d for an exemplar raw trace. All recordings
were baseline-subtracted to adjust for the leak current measured during the first
2 min before the onset of the GTPγS-induced current. Traces were time-aligned to
the wash-in of TGOT (red bar). For one cell in the GTPγS data set in which baseline
recording period was 10 min rather than 15 min, the pre-TGOT period was aligned 
to the start of the other recordings, and the remainder of the trace starting with 
TGOT wash-in was aligned to the TGOT wash-in of the other traces. 
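
The segment averaging and baseline subtraction described above can be sketched as follows. This is a minimal illustration rather than the authors' MATLAB routine: the function names, the choice to drop a trailing partial segment, and the toy trace are assumptions introduced here.

```python
from statistics import mean

def segment_means(trace, sample_rate_hz, segment_s=10.0):
    """Mean current of each consecutive segment (default 10 s).

    A trailing partial segment is dropped (an assumption, not stated in the text).
    """
    n = int(round(sample_rate_hz * segment_s))
    return [mean(trace[i:i + n]) for i in range(0, len(trace) - n + 1, n)]

def baseline_subtract(seg_means, segment_s=10.0, baseline_s=120.0):
    """Subtract the mean of the segments covering the first 2 min (leak current)."""
    n_base = int(round(baseline_s / segment_s))
    leak = mean(seg_means[:n_base])
    return [m - leak for m in seg_means]

# Toy trace sampled at 1 Hz: 120 s at -50 pA, then 180 s at -80 pA.
toy_trace = [-50.0] * 120 + [-80.0] * 180
toy_result = baseline_subtract(segment_means(toy_trace, sample_rate_hz=1.0))
```

Averaging over 10 s windows suppresses brief synaptic events while preserving the slow drug-induced current, which is the stated purpose of the segmentation.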

Low doses of kainate (100-500 nM) were used to establish gamma rhythms in 
hippocampal slices that closely resemble gamma rhythms in vivo. Field recording
electrodes were placed in the strata pyramidale and radiatum to monitor both the 
ongoing gamma oscillation and the EPSP-spike coupling. Field recording electrodes
were similar to those used for whole cell recordings, but filled with ACSF.

FSI voltage ramp recordings. For voltage ramp recordings, the internal solution
contained (in mM) 50 K-Gluconate, 70 CsMeSO3, 10 TEA-Cl, 1 MgCl2, 10
HEPES, 0.3 EGTA, 10 Tris-phosphocreatine, 4 Mg-ATP, and 0.3 Na-GTP. The
pipette reference potential was set to zero and a junction potential of −15.1 mV
(calculated using pClamp) was corrected post hoc. An additional, empirically 
measured correction factor of 3.3 mV was applied to correct for a change in the 
junction potential introduced by partial replacement of sodium with NMDG in the 
voltage ramp ACSF. Apart from the voltage ramp recordings, other membrane 
potentials reported are not corrected for liquid junction potentials. After obtaining 
a whole cell recording from a putative interneuron, the fast-spiking phenotype was 
verified as described below. The amplifier was then switched to voltage clamp 
mode and the bath solution was substituted for voltage ramp ACSF containing 
(in mM) 112 NaCl, 10 D-Glucose, 3 KCl, 1.25 NaH2PO4, 10 TEA-Cl, 1.3 MgCl2,
2 CaCl2, 26 NaHCO3, 5 4-Aminopyridine, 0.1 CdCl2 and 0.001 TTX. Voltage
ramps ~1 s in duration between −91 and +29 mV were applied once every
10 s until the current at each potential reached a steady state for > 2 min, at which
point TGOT was applied. In 3 out of 13 recordings the voltage ramp-activated 
current (1) became more negative at all potentials shortly after TGOT application, 
and (2) failed to return to baseline after washout of the drug. It was assumed that 
this global shift was caused by a change in the space clamp or access resistance and 
these recordings were excluded from further analysis. 

Drugs and reagents. All salts and buffers for intracellular and extracellular solutions,
as well as ATP, GTP, GTPγS, phosphocreatine and biocytin were purchased
from Sigma. TGOT ((Thr⁴,Gly⁷)-oxytocin), OTA ((d(CH2)5¹,Tyr(Me)²,Thr⁴,Orn⁸,
des-Gly-NH2⁹)-vasotocin) and CCK (cholecystokinin octapeptide) peptides were
purchased from Bachem, dissolved at 1 mM in ddH2O and stored at −20 °C until
use within 6 months of purchase. Bicuculline, TTX, NBQX and D-AP5 were
purchased from Ascent Scientific. ω-conotoxin GVIA and ω-agatoxin IVA were
purchased from Peptides International. Stock solutions were prepared and stored 
according to manufacturer specifications. 

Interneuron labelling and classification. Physiological classification of interneuron 
subtypes was based on established criteria. Fast-spiking cells were defined as
those including (1) peak firing rates > 200 Hz with little firing rate accommodation,
(2) characteristic FS action potential waveform, and (3) minimal hyperpolarization-induced
sag current due to Ih. Following interneuron recordings, slices were transferred
to a fixative solution containing 4% paraformaldehyde, 0.2% picric acid
and 1× phosphate buffered saline for 24-72 h before being stained with
3,3'-diaminobenzidine tetrahydrochloride (0.015%) using a standard ABC kit (Vector).
Neuronal cell types were identified based on morphology of axonal and dendritic 
arbors and electrophysiological properties of the cell. The FS perisomatic-targeting 
set includes both basket cells (shown), and axo-axonic cells (not shown). Because 
of technical challenges of discriminating FS basket and axo-axonic cells unequi- 
vocally, both cell types were pooled into a single group of FS perisomatic-targeting 
cells. When analysed separately, both putative types were equivalently responsive 
to TGOT. 

Analysis of cell-attached and intracellular recording data. Analysis of spikes,
evoked synaptic currents, and synaptic potentials was performed offline using
custom written routines in MATLAB (MathWorks). Spontaneous IPSCs were
detected using a modified version of the detectPSPs script by P. Larimer
(http://www.mathworks.com/matlabcentral/fileexchange). Spike jitter histograms were
calculated by subtracting the latency of each spike from the average latency of
spikes evoked in that cell. The average latency and jitter were calculated separately
for control and TGOT/CCK conditions in each cell. To measure the spike width,
raw data were oversampled to 133 kHz using the MATLAB spline function. Time
course of spontaneous activity in pyramidal cell attached recordings was calculated 
by averaging over all cells and smoothing in time with a boxcar filter (width = 7 
sweeps). 
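
As an illustration of the two calculations described in this paragraph, the following sketch computes per-spike latency deviations for one cell and applies a 7-sweep boxcar filter. It is written in Python rather than the MATLAB used for the published analysis; the function names are hypothetical, and the handling of the first and last sweeps (truncated windows) is an assumption, since the text does not state how edges were treated.

```python
from statistics import mean

def latency_deviations(latencies_ms):
    """Subtract each spike latency from the cell's average latency.

    The returned values are what would be binned into a jitter histogram.
    """
    avg = mean(latencies_ms)
    return [avg - t for t in latencies_ms]

def boxcar_smooth(values, width=7):
    """Centred moving average over `width` sweeps.

    Windows at the edges are truncated to the available sweeps (assumption).
    """
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        smoothed.append(mean(window))
    return smoothed

# Hypothetical example: three evoked spikes in one cell, latencies in ms.
example_deviations = latency_deviations([9.0, 10.0, 11.0])
```

Computing the deviations separately for control and TGOT/CCK sweeps, as the text specifies, keeps a drug-induced shift in mean latency from inflating the apparent jitter.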

Optical stimulation of channelrhodopsin-2. Photostimuli were produced by 
three Luxeon Rebel LEDs (470 nm, Philips Lumileds) driven by a custom-built 
controller. The LEDs were placed below the recording chamber for full slice
illumination once stable recording conditions were reached. Light pulses were
5 ms in duration with a power of approximately 0.5 mW per mm². ChR2-evoked
IPSCs were recorded from CA1 pyramidal neurons (n = 10 cells, n = 4 animals).
Six of these neurons were recorded in the same region of the same slice as neurons 
recorded in the cell-attached data set. 

Data analysis for cell-attached recordings involving blue light stimulation. In 
the full data set, blue light stimulation increased the spike probability in 13 of 16 
neurons (Supplementary Fig. 8a, 12% increase in spike probability including all 
neurons, P < 0.05, paired two-tailed t-test). In recordings from rat neurons the
average increase in spike firing probability with TGOT or CCK was not correlated 
with spike latency, whereas in the mouse data we found a strong correlation 
between control spike latency and the ChR2-induced increase in spike probability 
(Supplementary Fig. 8d, e). In the mouse data set, the shortest latency spikes 
showed the weakest increase in spike firing probability. Plotting the latency against 
the jitter of spikes elicited under control conditions, we found a clear separation 
between two groups of cells, in which evoked spikes from one set of cells occurred 
with very short latency and little jitter and spikes from another set of cells occurred 
at longer latency and with more jitter (Supplementary Fig. 8f). 

Because of the smaller size of the mouse brain, we found our slice angle to be less 
reliably transverse than in the rat preparation. As a result, the stimulating and 
recording electrodes were placed closer to one another in the mouse slice in order 
for the stimulating electrode to recruit a sufficient number of excitatory Schaffer 
Collateral fibres to drive an action potential in the postsynaptic CA1 pyramidal 
cell. This change in recording configuration unfortunately increases the probabi- 
lity of directly activating inhibitory fibres with the stimulating electrode and gene- 
rating a monosynaptic IPSC. A well-documented set of physiological parameters, 
including synaptic kinetics and cell excitability, ensures that the physiologically
relevant disynaptic IPSC arises mostly from FSIs. The monosynaptically activated 
IPSC, however, will arise from a less targeted subset of neurons, and therefore be 
less susceptible to modulation by interventions that selectively target FSIs. 

In the cell-attached recording configuration, it was impossible to determine 
directly the relative monosynaptic and disynaptic contributions to the feed-forward 
IPSC. The monosynaptic IPSC relies only on a single GABAergic synapse, however,
whereas the disynaptic IPSC relies on three sequential steps: (1) a glutamatergic 
synapse onto the interneuron, (2) the subsequent action potential in the inter- 
neuron, and finally (3) the GABAergic transmission onto the postsynaptic pyr- 
amidal cell. The monosynaptic IPSC will therefore occur with a shorter latency and 
less jitter than the disynaptically evoked IPSC. As a result, spikes in pyramidal cells 
in which the feed-forward IPSC is dominated by a monosynaptic component will 
be expected to occur with a shorter latency and less jitter than spikes in cells 
experiencing a more physiological disynaptic feed-forward IPSC. We therefore 
excluded the tightly clustered group of neurons with very short latency and low 
jitter spikes from the mouse data set (n = 9 cells) and analysed only the neurons in
which spikes occurred with a longer latency and more jitter, consistent with dis- 
ynaptic feed-forward inhibition (n = 7 cells). All of these remaining cells demon- 
strated an increase in spike firing probability following blue light stimulation (7 out 
of 7 cells, 28% increase in spike probability; P < 0.01 paired two-tailed t-test). In 
the complete data set (n = 16 cells) we observed a modest increase in spike latency
following blue light stimulation of PV interneurons across all 16 neurons 
(Supplementary Fig. 8b, c). However, in the 5 out of 7 cells from the restricted 
data set that fired at least 5 spikes in both the control and blue light stimulation 
conditions, light activation of PV interneurons reduced the latency (Supplemen- 
tary Fig. 8g, 10.35 ms in control, 10.07 ms following light stimulation; P = 0.73 
paired two-tailed t-test) and jitter (Supplementary Fig. 8h, 16.58 ms² control;
11.78 ms² light stimulation; P = 0.23 paired two-tailed t-test) of spikes. Although
this reduction in latency and jitter did not reach statistical significance, the trend is 
consistent with our TGOT and CCK results. 
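
The paired comparisons reported in this section rest on the paired t statistic, which can be sketched as below. This is an illustrative implementation only: the function name and the example values are hypothetical and are not data from the study. Converting t to a two-tailed P value requires the t distribution with n − 1 degrees of freedom; a library routine such as SciPy's `scipy.stats.ttest_rel` performs both steps.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(control, treated):
    """Return (t statistic, degrees of freedom) for paired observations.

    Each cell contributes one control and one treated value; the test is
    carried out on the within-cell differences.
    """
    diffs = [t - c for c, t in zip(control, treated)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical spike probabilities for four cells (not data from the paper).
t_stat, dof = paired_t([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 4.0, 6.0])
```

Pairing within cells removes between-cell variability from the comparison, which matters here because baseline latency and spike probability differ widely across neurons.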

Immunohistochemistry. At the end of each ChR2 recording session, slices were 
fixed overnight with 4% paraformaldehyde (PFA) in a phosphate buffered saline 
(PBS) solution and cryoprotected by immersion in 30% sucrose PBS solution overnight
at 4 °C. Tissues were embedded in Tissue Tek, frozen on dry ice, and cryosectioned
at 20 µm thickness. Sections were processed using 1.5% normal goat
serum (NGS) and 0.1% Triton X-100 in all procedures except washing steps, where 
only PBS was used. Sections were incubated in blocking solution for 1 h, followed 
by incubation with the primary antibodies overnight at 4 °C. Cryostat tissue sec- 
tions were stained with the primary antibodies: mouse anti-Parvalbumin (1:1,000, 
Sigma) and rabbit anti-DsRed (1:500, Chemicon). Secondary antibodies conju- 
gated with Alexa Fluor dyes 488, 594 (Molecular Probes) raised from the same host 
used for blocking serum were applied for 1 h at room temperature. Nuclear counterstaining
was performed with 100 ng ml⁻¹ 4′,6-diamidino-2-phenylindole (DAPI)
solution in PBS for 5 min. Fluorescent images were captured using a cooled charge-coupled
device (CCD) camera (Princeton Scientific Instruments) using MetaMorph
software (Molecular Devices). 

Virus injection. Adeno-associated virus carrying ChR2 fused to the fluorescent 
marker mCherry (AAV2/1.EF1.dflox.hChR2(H134R)-mCherry.WPREhGH; University
of Pennsylvania Gene Therapy Program Vector Core) was injected bilaterally into 
dorsal hippocampal CA1 region of Pvalb-cre (PV-Cre) transgenic mice (aged
between postnatal days 15-19) at three sites: 2.2, 1.8 and 1.6 mm posterior from 
bregma; 2.4, 2.1, 1.7 mm from midline; and 1.2, 1.1, and 1 mm below the cortical 
surface, respectively. Animals were anaesthetized with isoflurane, mounted in a 
stereotactic apparatus and kept under isoflurane anaesthesia during surgery. We 
injected 100 nl of virus at each location over a 2 min period using a glass micropipette
(tip diameter ~20 µm) attached to a Nanolitre 2000 pressure injection
apparatus (World Precision Instruments). The pipette was held in place for 
3 min following each injection before being completely retracted from the brain. 
Mice were returned to their home cage for 2-3 weeks before acute slice preparation 
to allow for virus expression. 

Computational model of EPSP-spike coupling. The computer modelling was 
performed using NEURON and automated using MATLAB. A simplified pyramidal
cell, consisting of a soma, a single axon and a single dendrite, was initialized
to starting parameters before each stimulus. Background and voltage-gated conductances
were based on reported models. Small adjustments were made to
improve agreement of parameters such as cell excitability and action potential 
waveform between the model and experimental observations. Each sweep consisted 
of (1) a ‘monosynaptic’ EPSC onto the dendrite, (2) a ‘disynaptic’ feed-forward 
IPSC onto the soma and dendrite 2 ms after the evoked EPSC (unless otherwise 
specified), and (3) multiple ‘spontaneous’ IPSCs onto the soma with randomly 
distributed amplitudes and timing. To isolate the role of the feed-forward IPSC 
from changes in inhibitory tone, spontaneous IPSCs were omitted in the simulation
used to generate Fig. 4h, i. At the outset of each set of sweeps, the ‘evoked’
EPSC-IPSC amplitudes were set empirically by increasing the EPSC and IPSC 
conductances together with a fixed ratio of 6:1 until ~50% chance of spike pro- 
pagation was reached. Experimental measurement of IPSC/EPSC ratio ranged 
from 2.62 to 5.20 (mean ± s.e.m. of 3.65 ± 0.28). This experimentally measured
range is presumed to be an underestimate of the true ratio due to imperfect 
isolation of the IPSC reversal potential, causing a presumed GABAergic contri- 
bution to the measured EPSC in some cells. In the model, IPSC/EPSC ratios from 
4:1 up to 6:1 showed a pronounced TGOT-induced increase in evoked spike proba- 
bility, with 6:1 supporting the strongest influence of TGOT on spike timing and 
jitter. Variability was introduced by using pseudo-random number generation to 
vary independently (1) the evoked EPSC conductance, (2) the evoked IPSC con- 
ductance and (3) the spontaneous IPSC timing and amplitudes. Evoked EPSC and 
IPSC conductances were varied independently on each sweep according to a normal
distribution centred on the empirically determined mean value, with a standard 
deviation that was 5% of the mean. TGOT was simulated by (1) reducing the evoked 
somatic IPSC conductance to 60% of ‘baseline’, while sparing the evoked EPSC and 
the dendritic IPSC, (2) doubling the spontaneous IPSC amplitude, and (3) increas- 
ing the spontaneous IPSC rate from 5 Hz to 35 Hz. The IPSC reversal potential was 
set at −110 mV for Supplementary Fig. 10b, c, consistent with the calculated
GABAA reversal potential in our whole cell recording conditions. For the rest of
the simulations, the IPSC reversal potential was set to −90 mV, consistent with
cell-attached recording conditions. The increase in evoked spike probability was robust
as the GABAA reversal potential was varied from −80 mV to −120 mV (Supplementary
Fig. 12), while the reduction in latency and latency jitter were decreased in
magnitude but remained statistically significant as the GABAA reversal potential
approached the neuron resting membrane potential. 
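
The per-sweep randomization and the simulated TGOT manipulation can be sketched as follows. This is a hedged illustration in Python, not the NEURON/MATLAB code used for the published model. The function name, the dictionary keys, the 100 ms window and the use of exponentially distributed intervals for spontaneous IPSC timing are all assumptions; the text states only that timing and amplitudes were randomly distributed. The 6:1 ratio is taken here as IPSC:EPSC conductance, following the IPSC/EPSC ratios quoted in the text.

```python
import random

def sample_sweep(rng, epsc_mean, tgot, ipsc_to_epsc=6.0, window_s=0.1):
    """Draw the randomized parameters for one simulated sweep."""
    sd_fraction = 0.05                      # s.d. is 5% of the mean (from the text)
    ipsc_mean = ipsc_to_epsc * epsc_mean
    epsc = rng.gauss(epsc_mean, sd_fraction * epsc_mean)
    ipsc = rng.gauss(ipsc_mean, sd_fraction * ipsc_mean)

    # TGOT: spontaneous IPSC rate rises from 5 Hz to 35 Hz (from the text).
    rate_hz = 35.0 if tgot else 5.0
    # Assumed Poisson process: exponential inter-event intervals.
    times, t = [], rng.expovariate(rate_hz)
    while t < window_s:
        times.append(t)
        t += rng.expovariate(rate_hz)

    return {
        "epsc": epsc,                                   # spared by TGOT
        "ipsc": ipsc,                                   # evoked IPSC before scaling
        "somatic_ipsc_scale": 0.6 if tgot else 1.0,     # 60% of baseline under TGOT
        "dendritic_ipsc_scale": 1.0,                    # spared by TGOT
        "spont_amp_scale": 2.0 if tgot else 1.0,        # amplitude doubled under TGOT
        "spont_rate_hz": rate_hz,
        "spont_times_s": times,
    }

rng = random.Random(0)  # fixed seed so a set of sweeps is reproducible
control_sweep = sample_sweep(rng, epsc_mean=1.0, tgot=False)
tgot_sweep = sample_sweep(rng, epsc_mean=1.0, tgot=True)
```

Drawing the EPSC and IPSC conductances independently, rather than scaling them together, lets sweep-to-sweep variation in their balance determine whether a spike is evoked near the ~50% operating point.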

Phosphorylated sphingolipids ceramide-1-phosphate (C1P) and
sphingosine-1-phosphate (S1P) have emerged as key regulators of
cell growth, survival, migration and inflammation. C1P produced
by ceramide kinase is an activator of group IVA cytosolic phospholipase
A2α (cPLA2α), the rate-limiting releaser of arachidonic acid
used for pro-inflammatory eicosanoid production, which contributes
to disease pathogenesis in asthma or airway hyper-responsiveness,
cancer, atherosclerosis and thrombosis. To modulate eicosanoid action 
and avoid the damaging effects of chronic inflammation, cells require 
efficient targeting, trafficking and presentation of C1P to specific 
cellular sites. Vesicular trafficking is likely but non-vesicular mechanisms
for C1P sensing, transfer and presentation remain unexplored.
Moreover, the molecular basis for selective recognition and binding 
among signalling lipids with phosphate headgroups, namely C1P, 
phosphatidic acid or their lyso-derivatives, remains unclear. Here, 
a ubiquitously expressed lipid transfer protein, human GLTPD1, 
named here CPTP, is shown to specifically transfer C1P between 
membranes. Crystal structures establish C1P binding through a novel 
surface-localized, phosphate headgroup recognition centre connected 
to an interior hydrophobic pocket that adaptively expands to ensheath 
differing-length lipid chains using a cleft-like gating mechanism. 
The two-layer, α-helically dominated ‘sandwich’ topology identifies
CPTP as the prototype for a new glycolipid transfer protein fold
subfamily. CPTP resides in the cell cytosol but associates with the trans- 
Golgi network, nucleus and plasma membrane. RNA interference- 
induced CPTP depletion elevates C1P steady-state levels and alters 
Golgi cisternae stack morphology. The resulting C1P decrease in 
plasma membranes and increase in the Golgi complex stimulates 
cPLA2α release of arachidonic acid, triggering pro-inflammatory
eicosanoid generation. 

During screening of the NCBI human genome database, we noted 
an in silico predicted transcript (GenBank NP_001025056.1; glycolipid 
transfer protein domain-containing protein 1; GLTPD1) encoding a 
protein sharing sequence identity (17%) with glycolipid transfer protein 
(GLTP). Although annotation indicated glycolipid binding and trans- 
port activity, Lys and Arg substitutions occurred at key positions (N52, 
W96) essential for sugar headgroup recognition by GLTP (Supplementary
Fig. 1a; yellow highlights). We validated GLTPD1 mRNA transcript
expression in human tissues, finding widespread occurrence and
relatively elevated transcript levels in placenta, kidney, pancreas and testis 
(Fig. 1a). Cloning and heterologous expression revealed that purified 
GLTPD1 (GenBank JN542538) transfers anthrylvinyl-C1P (AV-C1P) 
between 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) 
bilayer vesicles in a protein concentration-dependent fashion (Fig. 1b, c 
and Supplementary Fig. 1b, d) requiring acceptor membranes (Sup- 
plementary Fig. 1c). Testing of lipid specificity revealed no transfer of 
galactosylceramide (GalCer), lactosylceramide (LacCer), sphingomyelin
(SM), phosphatidylcholine (PC), phosphatidic acid (PA) or ceramide
(Cer) by GLTPD1 (Fig. 1b, c). Slow-down of AV-C1P transfer by
potential lipid ligands (nonfluorescent) showed no competition effect
by S1P (Fig. 1d and Supplementary Fig. 1e, f) and a transfer rate of
~4 C1P molecules per min per protein molecule at 37 °C. We have 
designated GLTPD1 as ceramide-1-phosphate transfer protein (CPTP). 

The crystal structure of human CPTP and 16:0-C1P complex (1.9 Å,
Supplementary Table 1) revealed a two-layered, all α-helical topology
(Fig. 1e-g) homologous with the GLTP fold. αN, α1 and α2 form one
layer, α4, α5 and α8 form another layer, and α3, α6 and α7 localize
along the periphery of the two-layer core. A positively-charged surface 
cavity for anchoring the lipid headgroup (Fig. 1h) extends through a 


[Figure 1 graphics not recoverable from extraction; surviving axis label: Lipid competitor (mole %)]

Figure 1 | CPTP lipid transfer activity and architecture. a, CPTP mRNA 
transcript levels in various human tissues. bp, base pairs. b, Lipid transfer in 
vitro by Förster resonance energy transfer. c, Initial lipid transfer rates for panel
b. d, Competition against CPTP-mediated AV-C1P transfer by nonfluorescent
lipids. Kinetic traces are shown in Supplementary Fig. 1e-g. Data in c and
d represent the mean ± s.d. of three independent experiments. e, 16:0-C1P
chemical formula. f, g, Two views of CPTP structure (ribbon) with bound 16:0-C1P
(space-filling). α-helices (cyan), 3₁₀-helices (light blue), loops (orange) and
bound 16:0-C1P (yellow, red, blue for carbon, oxygen, nitrogen, respectively).
α-helices (αN and α1-α8) are numbered from amino (N) to carboxy (C)
termini. h, Surface electrostatics of CPTP with bound 16:0-C1P showing 
positive- (blue) and negative- (red) charged residues. 
gateway portal, transforming into a deep interior hydrophobic cavity
that accommodates the sphingosine and acyl chains. Sealing of the
cavity bottom by Leu 10 and Leu 14 (αN-helix) renders it pocket-like.
A triad of cationic residues (K60, R106, R110) in the surface cavity
recognizes and binds the C1P phosphate headgroup (Fig. 2a). The
anchoring hydrogen-bond network is complex, involving bifurcated
hydrogen bonding by K60 (α2-helix) with the O1 and O2 atoms,
bidentate hydrogen bonding by R106 (α4-helix) with the O2 and O3
atoms, and bidentate hydrogen bonding by R110 (α4-helix) directly
and through water bridging to O3. Point mutation data support key
roles for K60 and R106 in C1P headgroup recognition with K60A and
R106L showing almost no C1P transfer, whereas the R110 mutation
(R110L) shows ~40% transfer (Fig. 2b). The positive charge of this site
is also enhanced by R97 (α3-4 loop), which hydrogen bonds to the O2
atom of phosphate through water bridging. R97 is almost fully active
when mutated to Leu (R97L), but mutation to acidic Glu (R97E)
reduces C1P transfer to ~55%, supporting its role in attracting the
lipid phosphate headgroup. Cation-pi interaction between R113 (α4-helix)
and Y149 (α5-6 loop) provides stabilizing underpinning for the
site (Supplementary Fig. 2c), as the R113 mutation (R113L) strongly
diminishes activity. Mutants Y149A, R113E and R113L show poor C1P
transfer, as expected by conformational destabilization (Fig. 2b). All

Figure 2 | CPTP conformation and functional recognition of C1P. a, CPTP
lipid headgroup recognition centre residue interaction with phosphate and
amide groups of bound 16:0-C1P (ball-and-stick). Hydrogen bonds shown as
dashed lines. CPTP Cα backbone is light grey; side chains, cyan; and oxygen
and nitrogen, red and blue, respectively. Water molecules are pink spheres.
b, C1P transfer by CPTP point mutants of phosphate headgroup recognition
cavity (cyan) or wild-type CPTP (wtCPTP, grey). c, Non-polar residues
forming hydrophobic pocket that accommodates 16:0-C1P sphingosine and
acyl chains. d, C1P transfer by CPTP point mutants (violet) of the hydrophobic
pocket. wtCPTP (grey); side chains shown in panel c. Data in b and d represent
the mean ± s.d. of three independent experiments. e, Conformational changes
in hydrophobic pocket upon 16:0-C1P binding. Side-chains of apo-CPTP 
(lavender; stick) and human CPTP (cyan; stick) with bound 16:0-C1P (yellow; 
ball-and-stick). f, g, Surface electrostatics of hydrophobic pocket opening at 
lipid headgroup recognition sites in apo-CPTP (f) and 16:0-C1P-CPTP 
complex (g). h, 12:0-C1P chemical formula. i, j, Crystal structures of CPTP 
(ribbon) with bound 12:0-C1P in sphingosine-in (i; beige) and sphingosine-out 
(j; pink) conformations. k, Superposition of bound 12:0-C1P in sphingosine-in 
and sphingosine-out conformations.

key residues of the phosphate recognition site appear to be conserved
in eukaryotes (Supplementary Fig. 2a). 

The hydrophobic pocket is lined by ~25 nonpolar residues, mainly 
Phe, Leu, Val and Ile (Fig. 2c and Supplementary Fig. 2b) that prevent 
water entry while ensheathing the ceramide aliphatic chains. Mutation 
of L43, L118 or L146 to positively-charged Arg, or V57 or V158 to high 
polarity Asn compromises hydrophobic pocket functionality and 
strongly diminishes C1P transfer (Fig. 2d). More conservative muta- 
tion (for example, W117A) only moderately reduces C1P transfer, 
whereas F42A near the pocket bottom stimulates C1P transfer. Mutation 
near the entry portal (I53N) or in the flexible α1-2 loop (F50R) is well
tolerated (75-80% active) (Fig. 2d). Ceramide entry is oriented by
hydrogen bonding of the lipid amide oxygen and nitrogen with
H150 and D56, respectively. Hydrogen bond disruption between lipid
amide nitrogen and D56 (D56V) moderately slows C1P transfer, but
H150 mutation (H150L) abolishes activity. Superpositioning of
apo- and 16:0-C1P-CPTP structures (root mean squared deviation (r.m.s.d.)
1.4 Å) shows K60, R106 and R110 nearly identically positioned in the
positively-charged surface cavities. Yet, large conformational differences
exist for I53, W36, W119 and F52 (Fig. 2e) due to closer packing
of certain α-helices in apo-CPTP (Supplementary Fig. 2d, e). Many Leu
and Phe are repositioned, reducing the solvent accessible volume
(40 Å³) (Supplementary Table 4) and effectively collapsing the hydrophobic
pocket (Fig. 2f, g) compared to the 16:0-C1P-CPTP complex
(364 Å³).

CPTP adaptability for different C1P species is reflected in structures 
of CPTP complexed with C1P containing differing-length acyl chains 
(Figs 2h-k, 3a-f and Supplementary Fig. 4c-f). Two lipid-binding 
conformational modes are apparent. In ‘sphingosine-in’ mode, both 
ceramide chains occupy the hydrophobic pocket, whereas only the 
acyl chain occupies the pocket in the ‘sphingosine-out’ mode. For the 
12:0-C1P-CPTP complex (Fig. 2h-j), both binding modes occurred in
the same asymmetric unit (Fig. 2i, j and Supplementary Figs 3c, d and 
4a, b) enabling comparison (Fig. 2k) under the closest possible condi- 
tions. The lipid phosphate headgroups and amide groups bind exactly 
as in the 16:0-C1P-CPTP complex (Supplementary Figs 2b and 4a, b). 
In sphingosine-out mode (Fig. 2j and Supplementary Fig. 3d), a bend 
in sphingosine at C6 is stabilized by hydrophobic interactions with 
V153, V154 and the D56 Cβ atom, enabling outward projection.
Sphingosine cross-bridging interactions with F50, 1149, A157 and 
V153 of neighbouring, symmetry-related CPTP stabilize further (Sup- 
plementary Fig. 4b). Solvent accessible pocket volumes reflect the 
altered sphingosine location, that is, 261 Å³ for sphingosine-out versus
329 Å³ for sphingosine-in (Supplementary Table 4). In the 18:1-C1P-CPTP
complex (Fig. 3a, b), the cis-double bond kink in the acyl chain
increases separation from the sphingosine chain, maximally expanding 
the pocket (stereo view; Supplementary Fig. 4c, d) although leaving the 
overall chain length in the pocket similar to 16:0-C1P. Accordingly, the 
solvent accessible volume of the hydrophobic pocket of 18:1-C1P-CPTP
is larger (387 Å³) (Supplementary Table 4) than in 16:0-C1P-CPTP
where slightly closer packing by the saturated acyl chain
decreases the solvent accessible volume (364 Å³). Shortening the acyl
chain length reduces solvent accessible pocket volumes to 104 Å³ for
8:0 and 263 Å³ for 2:0 (Supplementary Table 4). Structures for 2:0-C1P-CPTP
(sphingosine-in) and 8:0-C1P-CPTP (sphingosine-out)
are detailed in Supplementary Figs 3a, b and 4e, f and Supplemen- 
tary Discussion. 

The functional consequences of CPTP hydrophobic pocket struc- 
tural adaptability become clear upon transfer analyses. Pocket expan- 
sion accommodates ceramide aliphatic chains in ‘molecular ruler’-like 
fashion with CPTP adaptability limits optimized for 16:0- or 18:1-C1P 
species which are particularly effective competitors at slowing the AV-C1P
transfer rate (Fig. 3g), consistent with maximal pocket expansion
and optimal fit (Supplementary Table 4). It is noteworthy that C1P species
containing long lignoceryl (24:0) acyl chains are not very effective 
competitors, suggesting poor accommodation in the hydrophobic pocket

Figure 3 | CPTP accommodation and

because of adaptation limitations. Also, 16:0-C1P with dihydrosphingosine
base competes less effectively than 16:0-C1P with naturally prevalent 
sphingosine base. 

Structure determination of the di12:0-PA-CPTP complex elucidated
the molecular basis of phosphatidic acid non-transfer (Supplementary
Fig. 5a-h and Supplementary Table 1). Phosphatidic acid occupies the
same binding site and its phosphate group interacts with the same 
positively charged residues as C1P (Supplementary Fig. 5b-d). Yet, 
K60 hydrogen bonding is single rather than bifurcated, and the lack 
of the acyl-amide moiety results in no hydrogen bonding with D56, 
distorting the position of the phosphate headgroup and both lipid 
chains and loosening phosphatidic acid binding. The distorted inter- 
action mitigates phosphatidic acid transfer by CPTP (Supplementary 
Fig. 5e-h and Supplementary Discussion). 

CPTP architecture not only represents a new motif for specific
binding of phosphosphingolipids, but is also previously unknown for
any phosphate-modified biomolecule’*’’. In CPTP, the fixed cationic
residues of the phosphate recognition site undergo minimal
conformational change upon C1P binding. B-factor distribution analyses
show the regions between α1–α2 and α5–α6 are most flexible
(Supplementary Fig. 6a, b), consistent with a cleft-like gating
mechanism facilitating ceramide chain entry or exit. The conserved
lipid orientation in the
pocket, with the nonpolar acyl chain always inside regardless of sphin- 
gosine being in or out, supports a concerted mechanism of action in 
which the acyl chain enters first and leaves last during membrane 
interaction (Supplementary Fig. 6c, d and Supplementary Discussion). 

The conformational adaptability of the inherently flexible,
single-cavity, hydrophobic pocket of CPTP contrasts with lipid cavities
in fatty acid binding proteins which use β-barrels/β-cups to generate a
large, solvent-filled binding site that remains conformationally fixed
whether or not occupied by fatty acid’*. A single, fixed, lipid binding
cavity also is characteristic of START lipid binding domains in PC
transfer protein’? and CERT (ref. 20), which uses an α/β fold built
around an incomplete U-shaped β-barrel to bind ceramide”
(Supplementary Fig. 7 and Supplementary Discussion).

C1P (light blue) and 12:0-C1P (pink) lipid chains
in their sphingosine-out CPTP complexes.
g, Competition effects by C1P species containing
different acyl chains on the CPTP-mediated initial
transfer rate of fluorescent C1P. Data represent the
mean ± s.d. of three independent experiments.
h–n, CPTP localization to trans-Golgi and
endosomes in vivo. h, SDS-polyacrylamide gel
electrophoresis (SDS-PAGE) or immunoblot of
BSC1 whole cell lysate (anti-CPTP label) reveals a
single immunoreactive band (arrow). Dye front
position indicated by arrowhead. i, Anti-CPTP
decorates perinuclear membrane stacks, nuclear
membrane and plasma membrane (arrows).
j–l, Perinuclear membrane stacks detected by
anti-CPTP are trans-Golgi. j, BSC1 cell Golgi
stacks labelled with anti-GM130 (cis-Golgi
marker), anti-CPTP and anti-TGN-46 (trans-Golgi
marker). k, Pseudo-colour overlay of frames
from j. l, High magnification insets from k show
CPTP/TGN-46 co-localization and intensity
scattergram analyses with measured correlation
coefficients’. m, n, Anti-CPTP co-localization
with late (m, anti-Rab9 co-labelling) and early
(n, anti-Rab5 co-labelling) endosomes. Wide-field
fluorescence microscopy; scale bars, 10 µm (i),
2.5 µm (k), 1.0 µm (l).

In the human genome, the differing origins of CPTP and GLTP are 
clear. CPTP (214 amino acids) is encoded by a three-exon transcript 
originating from GLTPD1 on chromosome 1 (locus 1p36.33). GLTP 
(209 amino acids) is encoded by a five-exon transcript originating 
from GLTP on chromosome 12 (locus 12q24.11)**. The shared folding 
topology encoded by GLTPD1 and GLTP, despite only limited sequence 
homology (Supplementary Fig. 8a-e) and different lipid specificity, 
provides a striking example of evolutionary convergence and empha- 
sizes the structural premium placed by eukaryotes on conservation of 
this fold?*-?°. The related architectures of CPTP and GLTP, but with 
naturally evolved and remarkably different lipid headgroup specificity 
(Supplementary Discussion), suggest that the term ‘sphingolipid trans- 
fer protein (SLTP) superfamily’ might better reflect the existence of the 
two major subfamilies: CPTP, with selectivity for ceramide-linked 
phosphates; and GLTP, with selectivity for ceramide-linked sugars. 

In cells, CPTP tracked by monospecific antibody or fluorescent 
epitope tag enhanced green fluorescent protein (EGFP) is localized 
in the cytosol but also associates with perinuclear membranes (for 
example, Golgi/trans-Golgi network (TGN)/endosomes), nuclei and 
plasma membranes (Fig. 3i-n and Supplementary Fig. 9). No local- 
ization to mitochondria, lysosomes or the endoplasmic reticulum is 
detected. CPTP co-localization with TGN-46 verified interaction with 
the TGN, a site where ceramide kinase (CERK) generates C1P (refs 3, 
26-28) and led us to propose a C1P regulatory/sensing role for CPTP 
during CERK-mediated metabolic/signalling events. Short interfering
RNA (siRNA)-induced CPTP downregulation (~90%; Supplementary
Fig. 10a) elevated both 16:0-C1P and 24:1-C1P (~4-fold) (Fig. 4a)
and fragmented the Golgi cisternal stacks (Supplementary Fig. 11).
RNAi-induced C1P changes were partially rescued with moderately
active R110L and K60N, but not with inactive K60A and R106L
mutants (Fig. 4b). CPTP overexpression in the absence of RNAi
decreased 16:0-C1P and 24:1-C1P. K60A or R106L overexpression 
had the opposite effect (Supplementary Fig. 12) and fragmented the 
Golgi cisternal stacks (not shown) consistent with a dominant-nega- 
tive effect. CPTP depletion measurably decreased sphingosine and S1P 
(Supplementary Fig. 10b, c), 14:0-, 22:0-, 24:1- and 24:0-sphingomyelin, 
16:0-monohexosylceramide, and 24:0-Cer, but modestly increased 
24:1-Cer (Supplementary Fig. 10d-f and Supplementary Discussion). 
Subcellular fractionation showed increased 16:0-C1P levels in ‘heavy’ 
membranes (TGN/endosome-enriched) and nuclear fractions, and 
decreased levels in plasma membranes without affecting levels in ‘light’ 
membranes (cis-Golgi/endoplasmic reticulum (ER)-enriched). These 
CPTP-depletion-induced effects were rescued by ectopic wild-type
CPTP expression (Fig. 4g, Supplementary Fig. 13c and Supplementary
Discussion).

Because CERK-generated C1P induces Group IVA cPLA2α activity,
which releases arachidonic acid used to produce pro-inflammatory
eicosanoids”*”’, CPTP involvement was assessed. In siRNA-induced
CPTP-depleted cells, arachidonic acid increased (Fig. 4c) consistent
with C1P accumulation at the Golgi/TGN that activates cPLA2α
(ref. 26). Also elevated were major arachidonic acid metabolites
generated by cyclooxygenase (COX), lipoxygenase (LOX) and cytochrome
P450 (CYP) pathways, that is, PGE2, PGF2α, 6-keto-PGF1α (COX,
Fig. 4d); 5HETE, 8HETE, 12HETE (LOX, Fig. 4e); 11,12 EET (CYP,
Fig. 4f). By contrast, arachidonic acid and eicosanoid levels decreased
upon overexpression of wild-type CPTP but not the K60A or R106L
mutant (Supplementary Fig. 14). Parallel siRNA-induced
downregulation of CERK (Supplementary Fig. 10), the only established
producer of C1P in mammals, decreased 16:0-C1P (Fig. 4a), arachidonic
acid (Fig. 4c), and eicosanoids (Fig. 4d–f) elevated by CPTP depletion.

Figure 4h depicts a model showing how CPTP could regulate
pro-inflammatory eicosanoid generation. In mammals, the only established
pathway for C1P production is through phosphorylation of ceramide
by CERK at the cytoplasmic surface of the TGN***. CERK also contains
nuclear localization/export signals and traffics to the plasma membrane
by microtubule-driven vesicles in response to hyperosmotic shock”*.
To produce C1P, CERK uses ceramide delivered from its endoplasmic
reticulum synthetic site to the Golgi by either CERT (ref. 27) or
possibly by vesicular trafficking’’. C1P elevation by CERK is known to
activate soluble cPLA2α by enhancing translocation to the TGN*”®,
where cPLA2α action releases arachidonic acid needed by eicosanoid
producers such as COX-1 or COX-2. siRNA-induced CPTP depletion
elevates C1P in the Golgi complex and nucleus, but lowers C1P plasma
membrane levels. We propose that CPTP prevents excess C1P
accumulation after production by CERK, thereby regulating cPLA2α action,
diminishing arachidonic acid release and downstream generation of
pro-inflammatory eicosanoids. One destination for CPTP cargo is the
plasma membrane. Models involving CPTP in catabolic C1P generation
by sphingomyelinase D are less plausible because mammalian cells
lack this enzyme’ (Supplementary Discussion).

Figure 4 | CPTP siRNA depletion/rescue effects on cellular C1P levels,
arachidonic acid generation and eicosanoid release and model of CPTP cell
biological function. a, Intracellular C1P levels of siRNA-treated and control
A549 cells. Acyl composition of C1P species with sphingosine base chains
(d18:1) shown on the x axis. b, Rescue effect by partially active K60N and R110L
CPTP mutants, but not inactive K60A and R106L CPTP mutants with siRNA
targeted to the endogenous CPTP 3′ UTR. c–f, Arachidonic acid (AA) and
eicosanoids (COX, LOX and CYP pathways) secreted into media by
siRNA-treated and control A549 cells (blue, siControl; red, siCPTP; green, siCERK).
Data for b and d–g represent averages of 6 experiments (2 procedures) by
Student’s t-test (*P < 0.05, **P < 0.01; NS, not significant). 6-keto-PGF1α, the
primary prostacyclin metabolite, is nearly unaffected by siCPTP or siCERK
suggesting existence of an arachidonic acid pool derived independently of
C1P-activated cPLA2α. g, Changes in C1P levels at various subcellular locations by
CPTP-siRNA. Shown on the x axis are the C1P species containing sphingosine
base chains (d18:1) and either 16:0 or 24:1 acyl chains. PM, plasma membranes;
LM, light membranes (ER/cis-Golgi enriched); HM, heavy membranes
(trans-Golgi/endosome/mitochondria enriched); NM, nuclear membranes. Data
represent the mean ± s.d. of three independent experiments. h, Model for
CPTP regulation of eicosanoid production. C1P is synthesized by ceramide
kinase (CERK) which concentrates in the trans-Golgi network vicinity via its
pleckstrin homology (PH) domain during stimulation. To produce C1P, CERK
uses ceramide transported from the ER by ceramide transfer protein (CERT)
which also contains a targeting PH domain. After synthesis, C1P is transported
to subcellular destinations by CPTP and possibly by vesicular trafficking. RNAi
knockdown of CPTP (shown by the red x) leads to accumulation and elevation
of C1P at the Golgi complex, a condition that activates soluble cytosolic
phospholipase A2α (cPLA2α), releasing arachidonic acid needed for generation
of downstream, pro-inflammatory eicosanoids. CPTP overexpression has the
opposite effect on lipid levels. cPLA2α activation occurs by translocation from
the cytoplasm and/or cis-Golgi (red arrows) through enhanced anchoring to
C1P generated by CERK in the TGN. COX-1 and inducible COX-2, which use
arachidonic acid to produce pro-inflammatory prostaglandins, also
concentrate in the TGN vicinity during stimulation. For clarity, other
eicosanoid generator pathways, that is, LOX (for example, cytoplasmic
5-lipoxygenase) and CYP (ER-associated cytochrome P450), are not depicted.
Also not depicted is Golgi cisternal stack fragmentation induced by 24 h of
CPTP RNAi.

Previously, the only identified mechanism for regulating CERK- 
mediated production of C1P was by control of ceramide availability 
through ceramide transfer protein’. It is noteworthy that siRNA- 
induced CPTP depletion yields the highest increase in endogenous 
C1P reported to date, mostly as 16:0-C1P, and dramatically alters 
Golgi cisternal stack morphology, indicating that CPTP-mediated
transport is essential for maintaining proper Golgi organization by
safeguarding localized C1P levels. The ensuing stimulation of eicosanoid
production triggered by elevated C1P in the Golgi complex potentially
implicates CPTP in as-yet-unidentified disease states associated
with inflammation.


METHODS SUMMARY 


CPTP was cloned and expressed in BL21 (DE3) Star cells using pET-SUMO vector. 
Lipid intervesicular transfer assays were performed by Förster resonance energy
transfer. X-ray diffraction data were collected on crystals of mouse apo-CPTP and 
of human CPTP with bound C1P containing acyl chains of differing length, for 
example, 2:0, 8:0, 12:0, 16:0 and 18:1, as well as di-12:0 PA. For phasing, single- 
wavelength anomalous dispersion data were collected for Se-Met-labelled, 8:0- 
C1P-CPTP crystal complex at Se peak wavelength (Supplementary Table 2) enabling 
other structures to be solved by molecular replacement. Data collection, processing, 
structure solution and refinement are described in the Methods. Protein docking 
with membranes was performed using the Orientation of Proteins with Membranes 
modelling. Epifluorescence microscopy images of fixed BSC-1 cells were captured 
by labelling with anti-CPTP antibody and anti-TGN-46, anti-GM130, anti-p230, 
anti-Rab5 or anti-Rab9 followed by secondary antibodies coupled to Alexa Fluor 
488, Alexa Fluor 594, or Alexa Fluor 660 and counterstaining with DAPI (4’,6- 
diamidino-2-phenylindole). Reverse transcription PCR and quantitative PCR were 
used to evaluate endogenous CPTP mRNA tissue levels and siRNA-downregulated 
CPTP and CERK transcript levels. Sphingolipids were analysed by electrospray 
ionization tandem mass spectrometry (ESI-MS/MS) after separation by HPLC. 
Eicosanoids were analysed as detailed by the Lipid Maps Consortium. 


Full Methods and any associated references are available in the online version of
the paper.

METHODS

Protein expression and purification. GLTPD1 ORFs encoding human and
mouse (GenBank JN542538 & NP_077792.2) CPTP were cloned in pET-SUMO
vector (Invitrogen) and expressed in BL21 (DE3) Star cells (Invitrogen). Soluble
CPTP tagged N-terminally with His6-SUMO was affinity-purified by Ni-NTA
chromatography followed by ubiquitin-like protein 1 (Ulp1) SUMO protease
digestion overnight at 4 °C to release the His6-SUMO tag. Affinity repurification
by Ni-NTA chromatography was followed by FPLC gel filtration chromatography.
L-selenomethionine (Se-Met)-labelled protein for ab initio phasing was produced 
by feedback inhibition of the methionine synthesis pathway. Mutants were con- 
structed by QuikChange Site-Directed Mutagenesis (Stratagene) and verified by 
sequencing. 

CPTP lipid transfer activity involving membrane vesicles. Intermembrane lipid
transfer by CPTP was measured in real time by Förster resonance energy transfer
(FRET) between donor POPC vesicles, containing 1 mole % AV-lipid (acyl chain
omega-labelled with anthrylvinyl fluorophore, that is,
(11E)-12-(9-anthryl)-11-dodecenoyl), and 1.5 mole %
1-acyl-2-[9-(3-perylenoyl)-nonanoyl]-3-sn-glycero-3-phosphocholine (Per-PC)
and POPC acceptor vesicles at tenfold excess. In competition
assays, donor vesicles also contained competitor lipids at 0.5, 1.0 and 2.0 mole %
(ref. 30). Briefly, CPTP addition produces an exponential increase in AV emission
intensity (425 nm) as the protein transports AV-C1P away from the donor vesicles
(creating separation from the ‘nontransferable’ Per-PC) and delivers it to the POPC
acceptor vesicles present in tenfold excess. The time-dependent increase in 425 nm
emission relative to signal in the absence of CPTP reflects lipid transfer kinetics.
In the absence of acceptor vesicles, no transfer is observed. The initial lipid
transfer rate, V0, is obtained by nonlinear regression analyses (see
Supplementary Methods).
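
As an illustration of how an initial rate can be extracted from such a
time course, the Python sketch below fits a single-exponential rise,
F(t) = F0 + A(1 - exp(-kt)), and reports the slope at t = 0, A·k, as
the initial rate. The single-exponential model, the function name
fit_initial_rate and the search bounds on k are assumptions made for
illustration only; the regression actually used is described in the
Supplementary Methods.

```python
import math


def fit_initial_rate(times, signal, k_lo=1e-4, k_hi=10.0, iters=200):
    """Fit F(t) = F0 + A*(1 - exp(-k*t)) by least squares.

    Hypothetical helper, not the authors' procedure. For each trial k
    the best F0 and A follow from ordinary linear regression; k itself
    is found by golden-section search on log(k) between k_lo and k_hi.
    Returns (F0, A, k, v0) where v0 = A*k is the slope at t = 0.
    """
    if len(times) != len(signal) or len(times) < 3:
        raise ValueError("need at least three paired observations")

    def linear_fit(k):
        # Regress signal on x = 1 - exp(-k*t) for a fixed rate constant.
        x = [1.0 - math.exp(-k * t) for t in times]
        n = len(x)
        mx = sum(x) / n
        my = sum(signal) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, signal))
        a = sxy / sxx
        f0 = my - a * mx
        sse = sum((f0 + a * xi - yi) ** 2 for xi, yi in zip(x, signal))
        return sse, f0, a

    phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = math.log(k_lo), math.log(k_hi)
    for _ in range(iters):
        c = hi - phi * (hi - lo)
        d = lo + phi * (hi - lo)
        if linear_fit(math.exp(c))[0] < linear_fit(math.exp(d))[0]:
            hi = d  # minimum lies in [lo, d]
        else:
            lo = c  # minimum lies in [c, hi]
    k = math.exp((lo + hi) / 2.0)
    _, f0, a = linear_fit(k)
    return f0, a, k, a * k
```

Reporting A·k rather than k alone keeps the rate in signal units per
unit time, which is the quantity that competitor lipids would be
expected to lower.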

Crystallization and structure determination. Crystallization hits from initial 
screens were optimized by the hanging drop vapour diffusion method and sys- 
tematically varying pH and individual component concentrations (Supplemen- 
tary Table 3). For data collection, crystals were flash frozen (100 K) in reservoir 
solutions containing 20% (v/v) ethylene glycol. Diffraction data sets were collected 
on 24-ID-C and 24-ID-E beamlines at the Advanced Photon Source (APS) and 
X29 beamline at Brookhaven National Laboratory. All crystals belonged to differ- 
ent crystal forms. For phasing, single-wavelength anomalous dispersion (SAD) 
data were collected for Se-Met-labelled, 8:0-C1P-CPTP crystal complex at Se peak 
wavelength (Supplementary Table 2; see Supplementary Methods). Use of Se- 
CPTP structure enabled other structures to be solved by molecular replacement 
(Supplementary Methods). Statistics for data collection, refinement and SAD 
phasing are provided in Supplementary Tables 1 and 2. 

Epifluorescence microscopy analyses. BSC-1 cells on coverslips were fixed in
−20 °C methanol and labelled with anti-CPTP (Santa Cruz Biotechnology,
sc247014), and Golgi markers anti-TGN46, anti-GM130 and anti-p230, or
endosome markers anti-Rab5 and anti-Rab9 (Cell Signaling) followed by secondary
antibodies coupled to Alexa Fluor 488, Alexa Fluor 594 or Alexa Fluor 660. Cells
were counter-stained with DAPI, mounted in 10% PBS, 90% glycerol, imaged
using a Leica DM RXA2 microscope with a ×63 1.4 NA APO C objective, a
Hamamatsu ORCA ER CCD camera, and Simple PCI software and analysed as
intensity scattergrams with measured correlation coefficients*’. Time-lapse images
of living cells expressing EGFP-CPTP were captured with a Leica DM RXA2
microscope stand equipped with a Yokogawa CSU-10 spinning disk confocal head
and using illumination from a Coherent 488 nm 200 mW ‘Sapphire’ continuous
wave optically pumped solid-state laser and 3I Slidebook software” (see
Supplementary Methods for more details).
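
The kind of correlation coefficient measured from the intensity
scattergrams is not specified here. Assuming it is a Pearson
coefficient computed over paired pixel intensities from two channels,
a minimal Python sketch could look as follows; the function name
pearson_colocalization and the flat-list input format are hypothetical.

```python
import math


def pearson_colocalization(channel_a, channel_b):
    """Pearson correlation between two equally sized intensity lists.

    Illustrative only: each list holds the pixel intensities of one
    fluorescence channel, flattened in the same pixel order.
    """
    n = len(channel_a)
    if n != len(channel_b) or n < 2:
        raise ValueError("channels must be paired and hold >= 2 pixels")
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    cov = sum((a - mean_a) * (b - mean_b)
              for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no correlation")
    return cov / math.sqrt(var_a * var_b)
```

Values near 1 indicate that the two labels rise and fall together
across pixels, values near 0 indicate no linear relationship.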

Immunoblot analysis. BSC1 cells were grown to semi-confluence, collected by
manual scraping, pelleted, and boiled in SDS-PAGE buffer. Proteins were
separated on a 10% discontinuous SDS-PAGE gel, transferred to PVDF membrane
and immuno-labelled”’. The immunoreactive band was detected by
chemiluminescence (Image Quant system, GE Healthcare).

siRNA-mediated CPTP downregulation and rescue in cells ectopically expres- 
sing wild-type and mutant CPTP constructs. Low passage A549 cells (5 X 10°) 
were grown (10-cm plates) in appropriate medium under standard incubator 
conditions overnight. Cells were treated with siRNA (Dharmacon) against
CPTP (GLTPD1) or CERK as well as non-targeting siRNA sequence for control
per manufacturer’s protocol and incubated for 48 h under standard incubator
conditions. For rescue experiments, cells were transfected with either the empty
pFLAG-CMV4 (Neoʳ) plasmid or this vector containing wild-type CPTP, CPTP(K60A),
CPTP(K60N), CPTP(R106L) or CPTP(R110L). Batch cultures of cells stably
expressing the transfected constructs were obtained by selection for two weeks
in regular medium containing G418 (geneticin, 500 µg ml⁻¹) under standard
incubator conditions. Following selection, cells (5 10°) were transferred to 10- 
cm tissue culture plates and cultured overnight in regular media without G418 
under standard incubator conditions. Cells then were treated with either control 
siRNA or a mixture of 4 siRNA constructs (Dharmacon) designed against the 3’ 
UTR of endogenous CPTP (GLTPD1) mRNA following standard manufacturer's 
protocol (Supplementary Methods). The 3’UTR was not included in the ectopi- 
cally expressed constructs to ensure siRNA targeting only to endogenous CPTP. 
Cells were incubated 48 h in regular media without G418 under standard incubator 
conditions. Full serum media was replaced with media containing 2% serum 15h 
before harvest. 

RNA isolation, RT-PCR and quantitative PCR. To evaluate downregulation of 
CERK and CPTP, quantitative PCR was performed”. Briefly, total RNA was 
isolated using RNeasy kits (Qiagen). Total RNA (1 µg) was reverse transcribed
using Superscript III reverse transcriptase (Invitrogen). The level of CERK
transcript was monitored using quantitative PCR and TaqMan technology (Applied
Biosystems) specific to CERK and CPTP with 18S rRNA as control. cDNA was 
amplified using an ABI 7900HT with premixed primer-probe sets and TaqMan 
Universal PCR master mix (Applied Biosystems). 
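
The quantification scheme is not spelled out in this section. If
relative transcript levels were derived by the common comparative Ct
(ΔΔCt) approach with 18S rRNA as the reference, the calculation would
resemble the Python sketch below. The function name, the assumption of
perfect doubling per cycle and the example Ct values are hypothetical
and are not data from this study.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Comparative Ct (delta-delta-Ct) estimate of relative expression.

    Assumes 100% amplification efficiency, that is, a doubling of
    product per cycle. Returns target expression in treated cells as a
    fraction of that in control cells, normalised to the reference.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


# Hypothetical Ct values: target 3.3 cycles later after siRNA treatment.
remaining = relative_expression(28.3, 10.0, 25.0, 10.0)
knockdown_percent = 100.0 * (1.0 - remaining)
```

Under these assumed values roughly one tenth of the transcript
remains, which corresponds to a knockdown of about 90%.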

Intracellular sphingolipid analyses. Cell lipids were harvested using an improved 
Bligh-Dyer protocol”? (Supplementary Methods). Sphingolipids were separated by 
HPLC (Prominence HPLC system, Shimadzu) using a Kinetex C18 column
(50 × 2.1 mm, 2.6 µm; Phenomenex) and eluted using a linear gradient (solvent
A, methanol:water:formic acid (58:41:1) in 5 mM ammonium formate; solvent B,
methanol:formic acid (99:1) in 5 mM ammonium formate, 20–100% B (3.5 min)
and at 100% B (4.5 min); flow rate of 0.4 ml min⁻¹, 60 °C). ESI-MS/MS (API 4000
QTRAP instrument; Applied Biosystems, MDS Sciex) was used to detect C1P (ref. 29), 
ceramide, sphingosine, S1P, sphingomyelin, and monohexosyl ceramide under 
positive ionization (see Supplementary Methods). 

Eicosanoid analysis. Eicosanoids were analysed as detailed by the Lipid Maps 
Consortium**™. Culture media (4 ml) from siRNA was combined with 10%
methanol (400 µl) and glacial acetic acid (20 µl) before spiking with internal standard
(100 µl) containing the following deuterated eicosanoids (100 pg µl⁻¹, 10 ng total):
(d4) 6-keto-PGF1α, (d4) PGF2α, (d4) PGE2, (d4) PGD2, (d8) 5-hydroxyeicosatetraenoic
acid (5HETE), (d8) 15-hydroxyeicosatetraenoic acid (15HETE), (d8) 14,15
epoxyeicosatrienoic acid and (d8) arachidonic acid. Samples and vial rinses (5% MeOH;
2 ml) were applied to Strata-X SPE columns (Phenomenex), previously washed
with methanol (2 ml) and then dH2O (2 ml). Eicosanoids eluted with isopropanol
(2 ml), were dried in vacuo and reconstituted in EtOH:dH2O (50:50; 100 µl)
before HPLC ESI-MS/MS analysis (see Supplementary Methods).

Subcellular fractionation. Subcellular fractionation was performed by multi-step 
centrifugation as detailed and characterized previously” with minor modifications 
(see Supplementary Methods). Fraction enrichment was validated by SDS-PAGE/ 
western blotting (Supplementary Fig. 13b) using organelle markers for: nuclei 
(anti-lamin AC), trans-Golgi (anti-TGN46), ER (anti-protein disulphide
isomerase (PDI)) and plasma membrane (anti-caveolin-1).

The histone H4 lysine 16 acetyltransferase hMOF
regulates the outcome of autophagy

Autophagy is an evolutionarily conserved catabolic process involved
in several physiological and pathological processes'”. Although
primarily cytoprotective, autophagy can also contribute to cell death; it
is thus important to understand what distinguishes the life or death
decision in autophagic cells’. Here we report that induction of
autophagy is coupled to reduction of histone H4 lysine 16 acetylation
(H4K16ac) through downregulation of the histone acetyltransferase
hMOF (also called KAT8 or MYST1), and demonstrate that this
histone modification regulates the outcome of autophagy. At a 
genome-wide level, we find that H4K16 deacetylation is associated 
predominantly with the downregulation of autophagy-related genes. 
Antagonizing H4K16ac downregulation upon autophagy induc- 
tion results in the promotion of cell death. Our findings establish 
that alteration in a specific histone post-translational modification 
during autophagy affects the transcriptional regulation of autophagy- 
related genes and initiates a regulatory feedback loop, which serves as a 
key determinant of survival versus death responses upon autophagy 
induction. 

Autophagy is a catabolic process that results in the autophagosome- 
dependent lysosomal degradation of bulk cytoplasmic contents, abnor- 
mal protein aggregates and excess or damaged organelles’*. This process 
involves a series of dynamic membrane-rearrangements mediated by a 
core set of autophagy-related (ATG) proteins*. Although autophagy is 
primarily a protective process for the cell, it can also play a role in cell 
death’; however, it is not clear what distinguishes the life or death 
decision’. Protein acetylation, in particular acetylation of ULK1 and 
ATG proteins, has emerged as a regulator of autophagy*°.
Accumulating evidence has established sirtuin 1 (SIRT1), an NAD+-dependent
deacetylase, as a player in this process”!°. However, SIRT1 is not always
required for autophagy to occur. Indeed, although SIRT1
overexpression is sufficient to increase the basal level of autophagy and is
required for starvation-induced autophagy, it is not necessary for
rapamycin-induced autophagy”"’. SIRT1 has a wide range of non-histone targets,
but lysine 16 on histone H4 (H4K16) is its primary histone target’?”’.
The histone acetyltransferase hMOF/KAT8/MYST1 is necessary and
sufficient for the bulk of H4K16 acetylation and thereby antagonizes 
the enzymatic activity of SIRT1 (refs 13-15). Because SIRT1 has been 
linked to both autophagy and epigenetic chromatin changes, this 
encouraged us to investigate the role of covalent histone modifications 
in autophagy. As SIRT1 preferentially deacetylates H4K16ac’, we 
considered that this histone modification could be altered upon auto- 
phagy induction. 

We induced autophagy in mouse embryonic fibroblast (MEF) cells by 
amino-acid starvation and observed a decrease in acetylation of H4K16 
(Fig. 1a). To investigate whether the observed effect on H4K16ac was
linked to the role of SIRT1 during starvation-induced autophagy, or if the
deacetylation of H4K16 is a general feature of the autophagic process,
treatments with rapamycin or Torin 1, respectively allosteric and catalytic
inhibitors of the kinase mechanistic target of rapamycin (MTOR), were
used to induce SIRT1-independent autophagy (Supplementary Figs 2a
and 3a). The global amount of H4K16ac was robustly reduced in MEF
cells after those treatments (Fig. 1b and Supplementary Fig. 4a).
Interestingly, in the sirt1 null MEF cells, both rapamycin and Torin 1
treatments, but not amino-acid starvation, induced the downregulation
of H4K16ac, confirming that SIRT1 is not required for the repression of
this histone modification (Fig. 1c and Supplementary Figs 2b and 4b).
The downregulation of H4K16ac upon autophagy induction
occurred in various human cancer cell types, namely U1810, HeLa
and U2OS cells (Fig. 1d, e and Supplementary Fig. 4c, d) and was even
found to occur in yeast (Supplementary Fig. 5). Rapamycin treatment
did not affect total histone H4 amounts (Supplementary Fig. 6). The
changes in H4K16 acetylation status were linked to the occurrence
of autophagy as established by an increased lipidation of the
autophagic marker LC3, resulting in an increased ratio of the lipidated form
(LC3-II) to the unlipidated form (LC3-I), referred to later as LC3
conversion (Fig. 1a–d and Supplementary Fig. 3a, b). Similarly, in yeast
this treatment resulted in increased lipidation of Atg8 (yeast
homologue of LC3), and both cleavage and vacuolar localization of green
fluorescent protein (GFP)-tagged Atg8 (Supplementary Fig. 7a–c).

Figure 1 | Autophagy is associated with reduced acetylation of histone H4
lysine 16. a, Starvation (3 h)-induced autophagy results in a downregulation of
H4K16ac in histone extracts of MEF cells. b, Upon rapamycin treatment
(300 nM) LC3 conversion and downregulation of H4K16ac are observed in WT
MEF cells but not in the autophagy-deficient atg5−/− and atg7−/− MEF cells.
c, Rapamycin treatment increased the LC3-II/LC3-I ratio and promoted
H4K16ac decrease in sirt1−/− and WT MEF cells. d, Rapamycin-induced
autophagy led to downregulation of H4K16ac at 48 h in histone extracts of
HeLa and U2OS cells, and after 6 h in U1810 cells. e, Quantification of H4K16
acetylation by immunoblotting is depicted for rapamycin-treated cells. Data are
expressed as mean ± s.e.m. (n = 3–5); *P < 0.05; **P < 0.01.

Dynamic histone modifications play a pivotal role in cell regulatory
events'° and the H4K16 residue is of particular interest, as its acetylation
influences higher-order chromatin structure’ and plays an important
role in transcription’. Chromatin immunoprecipitation targeting
H4K16ac, followed by high-throughput sequencing (ChIP-seq) was
performed to investigate the genome-wide occurrence of this histone
mark in U1810 cells undergoing autophagy (Fig. 2a). H4K16ac
ChIP-seq data analysis showed 3,422 called peaks in untreated U1810 cells,
which subsequently showed reduced H4K16ac occupancy after 8 h
rapamycin treatment. To gain insight into the role of this induced
H4K16 deacetylation in the regulation of gene expression during
autophagy, we performed a global run-on-sequencing (GRO-seq) assay’?”°
to generate a genome-wide view of the location, orientation and
density of nascent transcripts engaged by RNA polymerases at high
resolution in rapamycin-treated compared with untreated U1810 cells
(Fig. 2b and Supplementary Fig. 8a). This approach unveiled a
significant alteration of the U1810 transcriptome with the identification of
1,622 significantly (fold change > 1.5 or < 0.75 and P < 0.001) up- or
downregulated genes (Fig. 2b, c and Supplementary Fig. 8a). A large
fraction of the identified genes (141 genes; 8.7%) were related to
autophagy (Fig. 2c and Supplementary Fig. 8b). There is an overall
coincidence across the autophagy-related genes between the alteration of
the GRO-seq signal and the absence of H4K16 acetylation. Indeed, 55
genes, that is 39% of the autophagy-related genes (including genes
belonging to the autophagic core machinery) identified by GRO-seq
analysis, showed reduced H4K16ac tag counts upon rapamycin
treatment (Fig. 2c and Supplementary Fig. 9a, b).
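
The selection rule and the quoted fractions can be restated as a short
Python sketch. It reads the criterion as (fold change > 1.5 or
fold change < 0.75) combined with P < 0.001, which is an
interpretation of the wording above; the function names are
hypothetical, and only the gene counts come from the text.

```python
def classify_gene(fold_change, p_value, up=1.5, down=0.75, alpha=0.001):
    """Label a gene 'up', 'down' or 'unchanged' under the stated cut-offs."""
    if p_value >= alpha:
        return "unchanged"
    if fold_change > up:
        return "up"
    if fold_change < down:
        return "down"
    return "unchanged"


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


# Counts quoted in the text.
autophagy_share = percent(141, 1622)   # autophagy-related among regulated genes
reduced_h4k16ac_share = percent(55, 141)  # of those, with reduced H4K16ac
```

The two percentages reproduce the rounded values given in the
paragraph, 8.7% and 39% respectively.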

To provide further evidence that the downregulation of H4K16ac 
during autophagy is part of a specific program, three extra histone 
modifications were examined, namely H3K4me3, H4K12ac and 
H4K8ac. In fact, whereas H4K16ac and H3K4me3 are known to be 
associated histone marks*!*’, K8 and K12 acetylation amounts on 
histone H4 are reported to be independent of H4K16 acetylation’. 
In agreement with the established molecular link between H4K16ac
and H3K4me3, rapamycin-induced autophagy was associated with a
reduction in H3K4me3 (Supplementary Fig. 10a, b), whereas H4K12ac
and H4K8ac were left unaffected (Supplementary Fig. 10c). The joint
downregulation of the H4K16ac and the H3K4me3 histone
modifications was also observed in wild-type (WT) yeast, but not atg1Δ, atg5Δ
and atg7Δ autophagy-deficient yeast or Sas2-overexpressing yeast
(Supplementary Fig. 10d). Collectively, these genome-wide deep-sequencing
analyses indicate that the observed deacetylation of H4K16 during
autophagy results in transcriptional regulation of autophagy-related genes.

Figure 2 | Deacetylation of H4K16 by rapamycin treatment is associated
with transcriptional regulation of autophagy-related genes. a, Heat map of
H4K16ac ChIP-seq performed in U1810 cells without or with 8 h rapamycin
treatment. Data are shown as log2 values of tag counts in the 3,422 regions defined
as peaks in the control sample. b, We analysed de novo detection of transcripts
using GRO-seq in 8 h rapamycin-treated U1810 cells and compared them with
untreated U1810 cells. GRO-seq data visualized as ‘MA’ plots (log ratio versus
abundance). The plot shows GRO-seq gene expression for pairwise comparison
between rapamycin-treated versus control cells. The red points denote the
differentially expressed genes. c, Autophagy-related genes identified as regulated
by rapamycin in the GRO-seq data analysis and in the ChIP-seq data analysis.

Because MTOR is involved in a wide variety of signalling pathways, 
treating cells with rapamycin or Torin 1 could cause the observed epi- 
genetic changes by mechanisms unrelated to autophagy. To exclude 
this possibility, we tested the effect of rapamycin on the H4K16ac 
histone modification in atg5- and atg7-deficient MEF cells. These 
genes encode two ATG proteins that are essential for the canonical 
autophagy pathway (Fig. 1b, lower panels). Treatment of autophagy-deficient
cells, namely atg7−/− or atg5−/− MEFs, with rapamycin did
not lead to a similar degree of downregulation of H4K16ac (Fig. 1b,
upper panels). Identical effects were observed in yeast (Supplementary 
Fig. 5). 

Thus, the process of autophagy, independent of whether its induc- 
tion required a SIRT1-dependent signalling pathway, was associated 
with deacetylation of H4K16. Collectively, these data suggest that 
alteration in another histone-modifying enzyme should be responsible 
for the observed modification in the acetylation status of H4K16. 
This observation prompted us to examine the status of hMOF during 
autophagy. Interestingly, although SIRT1 expression was not significantly
altered upon rapamycin treatment (Supplementary Fig. 11a),
hMOF expression was effectively downregulated in mammalian cells
(Fig. 3a, b and Supplementary Figs 3a-c and 11a). Similarly, robust
downregulation of hMOF expression was observed upon Torin 1
treatment or under amino-acid starvation (Supplementary Fig. 3a-d).
Remarkably, the observed downregulation of hMOF upon rapamycin,
Torin 1 or starvation treatment was abrogated when cells were co-treated
with inhibitors of autophagy such as chloroquine (CQ) or
3-methyladenine (3MA) (Supplementary Fig. 3a-c). In yeast cells
engineered to express a haemagglutinin (HA)-tagged version of
the yeast homologue of hMOF, Sas2, rapamycin treatment induced
a nearly complete loss of the HA signal within 3 h (Fig. 3g).
Overexpression of Sas2 repressed the downregulation of H4K16ac
upon rapamycin treatment in yeast cells (Fig. 3h). Altogether, these
results showed that the downregulation of hMOF is part of the autophagy
program.

The observed pronounced changes in amounts of H4K16 acetyla- 
tion and associated transcriptional gene regulation suggested that there 

Figure 3 | Rapamycin-induced hMOF downregulation promotes
deacetylation of H4K16. Rapamycin treatment (48 h) promoted the
downregulation of the H4K16 histone acetyltransferase hMOF expression level
in MEF (a) and transfected HeLa cells (b). VPA (1 mM) treatment counteracted
rapamycin-induced H4K16ac downregulation (c) and decreased the LC3 ratio
(d). e, Co-treatment with chloroquine (CQ, 10 µM) showed that the decrease in
LC3 ratio was a result of an increase in autophagic flux. f, Inhibition of
autophagy by CQ after hMOF overexpression shows that hMOF does not
inhibit autophagic flux. g, The yeast homologue of hMOF, Sas2, tagged with 3×HA
showed a complete disappearance of HA signal upon autophagy induction
after 3 h. h, Overexpression of Sas2 repressed the downregulation of H4K16ac
upon rapamycin treatment.

may be a functional role for this epigenetic change during autophagy. It
has only recently become clear how histone modifications can play a 
regulatory role in apoptosis and how they can influence the decision 
between life and death (reviewed in ref. 25). A similar regulatory role 
for histone modifications could be present during autophagy. Shifting 
the equilibrium of hMOF and SIRT1 expression in favour of SIRT1 
leads to a decrease in acetylation of H4K16 (ref. 13). Treatment with 
valproic acid (VPA) increased the acetylation status of H4K16 (Fig. 3c 
and Supplementary Fig. 11b-d) by reducing SIRT1 amounts”’. Treat- 
ment with VPA was not only able to reverse rapamycin-induced down- 
regulation of the H4K16ac histone modification (Fig. 3c), but also 
promoted LC3 conversion and turnover (Fig. 3d and Supplementary 
Fig. 11e). Administration of chloroquine to inhibit lysosomal activity
enhanced the rapamycin-induced increase in LC3-II amounts. Co- 
treatment with VPA led to a further increase in the LC3-II amounts, 
confirming the increased autophagic flux in those cells (Fig. 3e and 
Supplementary Fig. 11f). Combined treatment with bafilomycin A1
(BafA), which prevents maturation of autophagic vacuoles by inhibiting
fusion between autophagosomes and lysosomes, led to similar
results (Supplementary Fig. 12). The increase in the autophagic flux in 
those cells was further confirmed, making use of a tandem reporter 
construct, mRFP-GFP-LC3 (ref. 26) (Fig. 4a, b and Supplementary 
Figs 13a, 14 and 15). Rapamycin treatment resulted in an increase of 
yellow-colour-labelled LC3 puncta in U1810 and HeLa cells. In con- 
trast, a remarkable increase in punctate red fluorescent signals was 
detected upon VPA and rapamycin treatment in both cell types 
(Fig. 4a, b and Supplementary Figs 13a and 14). Similarly, hMOF 
overexpression in HeLa and U1810 cells correlated with increased 
autophagic flux upon rapamycin treatment, as illustrated by LC3 
immunoblotting (Fig. 3b, f) and the mRFP-GFP-LC3 autophagic flux 
assay (Supplementary Fig. 15). 

We extended our investigation to the analysis of cell death. We 
observed a significant increase in cell death in human-derived cell lines 
co-treated with rapamycin and either VPA or the SIRT1-specific inhibitor
Ex527 (ref. 13), as demonstrated by the appearance of condensed or
fragmented nuclei (Fig. 4c, f and Supplementary Figs 13b, e and 16a, b).
These results were further confirmed by fluorescence-activated cell 
sorting analysis of the appearance of a sub-G1 hypodiploid DNA peak 
(Supplementary Fig. 17a, b). In agreement with the observed effect of 
SIRT1 chemical inhibitors on the outcome of autophagy, we observed 
a significant increase in cell death in HeLa or U1810 cells upon SIRT1- 
knockdown and rapamycin treatment (Fig. 4e and Supplementary 
Fig. 13d). To investigate whether the observed cell death upon abrogation 
of H4K16 deacetylation after rapamycin treatment was a consequence of 
autophagy induction, we performed an extra set of experiments with 
CQ, BafA and 3MA, which inhibit different steps in the autophagic 
pathway. We noted that co-treatment with these inhibitors of auto- 
phagy was able to abrogate both rapamycin + VPA- and rapamycin + 
Ex527-induced cell death in human cancer cells (Fig. 4d, g, h and Sup- 
plementary Figs 13c and 16c-e). Knockdown of ATG7 expression in 
mammalian cells prevented rapamycin + VPA-induced cell death and 
thus further strengthens the conclusion about an autophagy-mediated cell 
death (Supplementary Fig. 18). It is worth noting that the increased cell 
death upon VPA addition was not limited to rapamycin-induced auto- 
phagy, and was observed in amino-acid-starved HeLa cells (Fig. 4i, j). 
Furthermore, we investigated the link between hMOF activity and the 
outcome of autophagy. In agreement with the discovery that the perturba- 
tion of H4K16 acetylation status regulates the outcome of autophagy, we 
observed a significant increase in cell death in hMOF-overexpressing 
HeLa or U1810 cells upon rapamycin treatment, mimicking the effect 
of VPA/rapamycin co-treatment (Fig. 4f and Supplementary Fig. 13e). 
Collectively, these data indicate that the downregulation of hMOF, the 
associated reduction in H4K16 acetylation level and transcriptional regu- 
lation of autophagy-related genes are required for the proper progression 
of the autophagic process, and that disturbance of this epigenetic program 
results in cell death. 

In conclusion, until now, nuclear events have not been considered of 
primary importance for autophagy, as enucleated cells are still able 
to accumulate GFP-LC3 puncta in response to autophagic stimuli.
Our data, however, unveil a critical link of the induction of autophagy 

Figure 4 | Inhibition of H4K16ac downregulation upon autophagy
induction results in cell death. a, VPA increases the autophagic flux in
rapamycin-treated HeLa cells transfected with the mRFP-GFP-LC3 tandem
reporter construct, which allows distinction between autophagosomes
(GFP+/RFP+ yellow puncta) and autolysosomes (GFP−/RFP+ red puncta).
b, Confocal microscopy image of a cell treated with rapamycin and VPA
depicting a high ratio of red to green LC3 puncta, indicating an increase in
autophagic flux. c, Co-treatment with VPA and rapamycin led to increased cell
death. d, Co-treatment with CQ abrogated VPA + rapamycin-induced cell
death. e, f, Increasing H4K16ac levels by overexpression of hMOF, or by
inhibition of SIRT1 with siRNA or the chemical inhibitor Ex527, promoted cell
death upon rapamycin treatment. Co-treatment with the autophagy inhibitor
bafilomycin A1 (BafA, 40 nM) (g) or 3-methyladenine (3MA, 5 mM)
(h) abrogated VPA + rapamycin-induced cell death. Treatment of HeLa cells
upon amino-acid starvation with VPA induced cell death (i), which was
rescued when cells were co-treated with CQ (j). Data are expressed as
mean ± s.e.m. (n = 3).

and covalent histone H4K16 modifications with altered gene
expression, including the regulation of key genes in the autophagy 
program. Our findings imply a molecular histone switch, where the 
balancing effects of hMOF and SIRT1 on H4K16 acetylation regulate 
autophagy (Supplementary Fig. 1). Our results do not oppose the 
findings about functionality of the autophagic process in enucleated 
cells, but add a new feedback regulatory network influencing the out- 
come of autophagy with respect to cell death/survival. The identifica- 
tion of tightly regulated histone modifications associated with the 
autophagic process offers an attractive conceptual framework to 
understand the short-term transcriptional response to stimuli eliciting 
autophagy, as well as constituting a potential aspect of long-term 
responses to autophagy. 


METHODS SUMMARY 


Antibodies and reagents used in this study are listed in Supplementary Tables 1 
and 2. ON-TARGET plus SMARTpools short interfering RNAs (siRNAs) were 
purchased from Dharmacon (Supplementary Table 3). Experiments were performed
on U1810, U2OS and HeLa human cancer cells and wild-type, atg5−/−,
atg7−/− and sirt1−/− mouse embryonic fibroblasts as well as wild-type, atg1Δ,
atg5Δ and atg7Δ SEY6210 yeast cells. Histone protein extracts, total protein
extracts and immunoblotting were performed as reported previously. ChIP-seq
and GRO-seq analyses were executed as described elsewhere. Methods to
monitor autophagy flux follow ref. 29. Statistical evaluations were performed by 
Student’s t-test. 


Full Methods and any associated references are available in the online version of 
the paper. 
METHODS


Cell culture and transfection. Non-small-cell lung carcinoma U1810 cells,
osteosarcoma U2OS cells, cervical cancer HeLa cells and wild-type, atg5−/−, atg7−/− and
sirt1−/− mouse embryonic fibroblasts (MEFs) were cultured using standard procedures.
The atg5−/− and WT MEF cell lines were gifts from G. McInerney, the atg7−/−
MEF cell line was a gift from M. Komatsu and the sirt1−/− MEF cell line was a gift
from X. Li. SEY6210 wild-type, atg1Δ, atg5Δ and atg7Δ yeast cells were grown as
described previously. Reagents used in this study are listed in Supplementary
Table 2. Plasmids encoding HA-human MOF, Flag-human SIRT1 and mRFP-GFP-LC3
were gifts from R. G. Roeder, L. Guarente and G. McInerney, respectively.
Transfection of cells used lipofectamine and lipofectamine reagent in HeLa cells and
X-tremeGENE HP DNA transfection reagent in U1810 cells.

Histone extracts and immunoblotting. Histone protein extraction was performed
as described elsewhere using TCA precipitation and H2SO4 extraction or using
the Histone Purification Mini Kit (Active Motif). Total protein extracts and
immunoblotting were performed as reported previously. GAPDH, β-actin, histone 3
(H3) and phosphoglycerate kinase 1 (Pgk1) were used as standards for equal
loading of protein. Antibodies used in this study are listed in Supplementary
Table 1. Densitometry was done using ImageJ.

Yeast procedures. The GFP-Atg8 processing assay and fluorescence microscopy
were performed as described previously. If not stated otherwise, all experiments
were performed at 1 h treatment.

Immunofluorescence and confocal microscopy. For confocal microscopy analysis,
the adherent mammalian cells were grown on coverslips. Paraformaldehyde-fixed
cells were blocked in HEPES, 3% bovine serum albumin, 0.3% Triton X-100
and incubated with primary (4 °C, overnight) and secondary (room temperature,
1 h) antibodies. Samples were mounted with Vectashield (Vector Laboratories)
and analysed with a Zeiss 510 META confocal laser scanning microscope (Zeiss).

Evaluation of autophagic flux. Autophagy was assessed following the ‘Guidelines
for the use and interpretation of assays for monitoring autophagy’. To monitor
the autophagic flux, a tandem reporter construct mRFP-GFP-LC3 was used. The
green fluorescence of this tandem reporter is attenuated in the acidic pH lysosomal
environment, whereas the mRFP fluorescence is not. Therefore, the green fluorescent
component of the composite yellow fluorescence from this mRFP-GFP-LC3 reporter is
lost upon autophagosome fusion with a lysosome, whereas the red fluorescence
remains detectable. Thus this probe allows distinction between autophagosomes
(GFP+/RFP+ yellow puncta) and autolysosomes (GFP−/RFP+ red puncta). At
24 h after plating, the cells were transfected with the mRFP-GFP-LC3 plasmid,
alone or in combination with SIRT1 siRNA, control siRNA or hMOF plasmid.
ON-TARGET plus SMARTpools siRNAs were purchased from Dharmacon
(Supplementary Table 3). The next day, cells were treated for an extra 24 h with the
indicated compound(s). Cells were then fixed using 4% paraformaldehyde, and
autophagy was determined by quantification of the number of cells with LC3-positive
organelles, counting at least 100 cells in triplicate per condition. The
presence of autophagic vacuoles expressing endogenous LC3 was also assessed.
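
The logic of the tandem reporter read-out can be sketched as follows. This is only an illustration of the GFP+/RFP+ versus GFP−/RFP+ distinction described above: the function names are hypothetical, and it assumes that each punctum has already been scored as positive or negative in each channel, a thresholding step that is not specified here.

```python
def classify_punctum(gfp_positive, rfp_positive):
    """Classify one LC3 punctum from the mRFP-GFP-LC3 tandem reporter."""
    if rfp_positive and gfp_positive:
        return "autophagosome"   # yellow: GFP still fluorescent
    if rfp_positive:
        return "autolysosome"    # red only: GFP attenuated at acidic pH
    return "unclassified"        # no mRFP signal; not a reporter punctum

def flux_summary(puncta):
    """Count puncta per class and return the autolysosome fraction.

    `puncta` is an iterable of (gfp_positive, rfp_positive) booleans.
    """
    counts = {"autophagosome": 0, "autolysosome": 0, "unclassified": 0}
    for gfp_positive, rfp_positive in puncta:
        counts[classify_punctum(gfp_positive, rfp_positive)] += 1
    classified = counts["autophagosome"] + counts["autolysosome"]
    fraction = counts["autolysosome"] / classified if classified else 0.0
    return counts, fraction

# Hypothetical scoring of four puncta in one cell.
counts, red_fraction = flux_summary(
    [(True, True), (False, True), (False, True), (True, False)]
)
```

A higher proportion of red-only puncta is what the text interprets as increased autophagic flux.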

Cell death quantification. After treatment, cells were fixed in 4% paraformaldehyde,
collected and cytospins were prepared. Subsequently, DNA was stained with
Hoechst 33342 (0.1 mg ml−1; Molecular Probes/Invitrogen). The number of dying
cells was measured quantitatively by assessing the percentage of cells with
fragmented, damaged or condensed nuclei.

Fluorescence-activated cell sorting analysis. Quantification of PI (Sigma) and 
TMRE (Molecular Probes/Invitrogen) staining was performed with a FACSCalibur 
flow cytometer (Becton Dickinson) using standard procedures.

ChIP-sequencing. For ChIP-seq analysis, 5 µg of chromatin was used in two
separate immunoprecipitations and combined in one elution for each condition.
Subsequently, the DNA sequencing library was made using a kit from Illumina
(catalogue number 1003473) except that Illumina TruSeq adaptors (to enable
multiplexing) were used. The library was analysed by Solexa/Illumina HiSeq.
After pre-filtering the raw data by removing sequenced adapters and low quality
reads, the sequence tags were aligned to the human genome (assembly hg19) with
the Bowtie alignment tool. To avoid any PCR-generated spikes we allowed only
one read per chromosomal position, thus eliminating PCR bias. From the filtered
raw data, 8 million unique reads per sample were used for peak detection. Peak
detection was performed using the CisGenome program with a two-sample
analysis where sequenced input (1%) was used as a negative control. Peaks were
called with a window statistic cut-off of 3 and a log2-fold change of 2. Using the
defined chromosomal peak regions from the no-treatment condition, the number
of tags was counted in the corresponding rapamycin-treated sample and heat
maps were generated using Java Treeview.
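
Two of the steps above, keeping one read per chromosomal position and counting tags inside control-defined peak regions, can be sketched in a few lines. This is an illustrative sketch and not the CisGenome workflow itself: reads are reduced to (chromosome, start) pairs, strand is ignored, and the half-open region convention is an assumption.

```python
def keep_one_read_per_position(reads):
    """Keep the first read seen at each (chromosome, start) position."""
    seen = set()
    unique_reads = []
    for chrom, start in reads:
        if (chrom, start) not in seen:
            seen.add((chrom, start))
            unique_reads.append((chrom, start))
    return unique_reads

def count_tags_in_region(reads, chrom, region_start, region_end):
    """Count reads whose start falls in [region_start, region_end)."""
    return sum(
        1 for read_chrom, start in reads
        if read_chrom == chrom and region_start <= start < region_end
    )

# Hypothetical aligned reads; three share the same position on chr1.
aligned = [("chr1", 100), ("chr1", 100), ("chr1", 150),
           ("chr2", 100), ("chr1", 100)]
unique = keep_one_read_per_position(aligned)
tags_in_peak = count_tags_in_region(unique, "chr1", 90, 160)
```

Counting the treated sample's tags in regions defined from the control sample, as described, keeps the two conditions comparable region by region.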

GRO-sequencing. GRO-seq experiments were performed as previously
reported. Briefly, cells were washed with cold 1× PBS buffer and swelled in
swelling buffer (10 mM Tris-Cl pH 7.5, 2 mM MgCl2, 3 mM CaCl2) for 5 min on
ice and collected. Cells were lysed in lysis buffer (swelling buffer with 0.5% NP-40,
2 units per millilitre SUPERase In and 10% glycerol) and finally re-suspended in
100 µl of freezing buffer (50 mM Tris-Cl pH 8.3, 40% glycerol, 5 mM MgCl2,
0.1 mM EDTA). For the run-on assay, re-suspended nuclei were mixed with an
equal volume of reaction buffer (10 mM Tris-Cl pH 8.0, 5 mM MgCl2, 1 mM DTT,
300 mM KCl, 20 units of SUPERase In, 1% sarkosyl, 500 µM ATP, GTP and
Br-UTP, 2 µM CTP) and incubated for 5 min at 30 °C. The nuclear-run-on
RNA (NRO-RNA) was then extracted with TRIzol LS reagent (Invitrogen)
following the manufacturer’s instructions. NRO-RNA was then subjected to base
hydrolysis on ice for 40 min, followed by treatment with DNase I and antarctic
phosphatase. To purify the Br-UTP-labelled nascent RNA, the NRO-RNA was
immunoprecipitated with anti-BrdU agarose beads (Santa Cruz Biotech) in binding
buffer (0.5× SSPE, 1 mM EDTA, 0.05% Tween-20). To repair the end, the
immunoprecipitated BrU-RNA was re-suspended in a 50 µl reaction (45 µl DEPC
water, 5.2 µl T4 PNK buffer, 1 µl SUPERase In and 1 µl T4 PNK (NEB)) and
incubated at 37 °C for 1 h. The RNA was extracted and precipitated using acidic
phenol-chloroform. The complementary DNA (cDNA) synthesis was performed
essentially as in ref. 36 with a few minor modifications. First, RNA fragments were
subjected to a poly-A tailing reaction by poly-A polymerase (NEB) for 30 min at
37 °C. Subsequently, reverse transcription was performed using oNTI223 primer
(5'-pGATCGTCGGACTGTAGAACTCT;CAAGCAGAAGACGGCATACGATTTTTTTTTTTTTTTTTTTTTVN).
Second, tailed RNA (8.0 µl) was subjected
to reverse transcription using Superscript III (Invitrogen). The cDNA products
were separated on a 10% polyacrylamide TBE-urea gel and the extended
first-strand product (100-500 base pairs) was excised and recovered by gel extraction.
After that, the first-strand cDNA was circularized by CircLigase (Epicentre) and
relinearized by ApeI (NEB). Relinearized single-strand cDNA (sscDNA) was
separated on a 10% polyacrylamide TBE gel and the product of the required size was
excised (~120-320 base pairs) for gel extraction. Finally, the single-strand cDNA
template was amplified by PCR using the Phusion High-Fidelity enzyme (NEB)
according to the manufacturer’s instructions with two oligonucleotide primers,
oNTI200 (5'-CAAGCAGAAGACGGCATA) and oNTI201
(5'-AATGATACGGCGACCACCGACAGGTTCAGAGTTCTACAGTCCGACG). DNA was then
sequenced on the Illumina HiSeq2000 according to the manufacturer’s instructions,
using the small RNA sequencing primer
5'-CGACAGGTTCAGAGTTCTACAGTCCGACGATC.

Database analyses. The following autophagy databases were used to identify
autophagy-related genes: the human autophagy database at http://autophagy.lu/
and the autophagy database at http://www.tanpaku.org/autophagy/overview.html.

Statistical analyses. Bars and error bars represent mean with s.e.m. Statistical
evaluations were performed by Student’s t-test.
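
As a worked illustration of these summary statistics, the sketch below computes mean, s.e.m. and a two-sample Student's t statistic for n = 3 replicates. The percentages are invented for the example, and the equal-variance, unpaired form of the test is an assumption, since the variant used is not stated. Converting t to a P value needs the t distribution, which a statistics package would normally supply.

```python
import math
import statistics

def mean_and_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem

def student_t(sample_a, sample_b):
    """Unpaired, equal-variance Student's t statistic and degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / dof
    diff = statistics.mean(sample_b) - statistics.mean(sample_a)
    t_stat = diff / math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return t_stat, dof

# Hypothetical percentages of cells with condensed or fragmented nuclei.
control = [4.0, 5.0, 6.0]
treated = [14.0, 15.0, 16.0]
control_mean, control_sem = mean_and_sem(control)
t_stat, dof = student_t(control, treated)
```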


Optical control of mammalian endogenous
transcription and epigenetic states


Silvana Konermann, Mark D. Brigham, Alexandro E. Trevino, Patrick D. Hsu, Matthias Heidenreich, Le Cong,
Randall J. Platt, David A. Scott, George M. Church & Feng Zhang


The dynamic nature of gene expression enables cellular program- 
ming, homeostasis and environmental adaptation in living systems. 
Dissection of causal gene functions in cellular and organismal processes
therefore necessitates approaches that enable spatially and temporally
precise modulation of gene expression. Recently, a variety of
microbial and plant-derived light-sensitive proteins have been engineered
as optogenetic actuators, enabling high-precision spatiotemporal
control of many cellular functions. However, versatile and
robust technologies that enable optical modulation of transcription
in the mammalian endogenous genome remain elusive. Here
we describe the development of light-inducible transcriptional effectors
(LITEs), an optogenetic two-hybrid system integrating the customizable
TALE DNA-binding domain with the light-sensitive
cryptochrome 2 protein and its interacting partner CIB1 from
Arabidopsis thaliana. LITEs do not require additional exogenous
chemical cofactors, are easily customized to target many endogenous
genomic loci, and can be activated within minutes with reversibility.
LITEs can be packaged into viral vectors and genetically targeted to 
probe specific cell populations. We have applied this system in pri- 
mary mouse neurons, as well as in the brain of freely behaving mice 
in vivo to mediate reversible modulation of mammalian endogen- 
ous gene expression as well as targeted epigenetic chromatin modi- 
fications. The LITE system establishes a novel mode of optogenetic 
control of endogenous cellular processes and enables direct testing 
of the causal roles of genetic and epigenetic regulation in normal bio- 
logical processes and disease states. 

The LITE system uses a modular design consisting of two independent
components (Fig. 1a). The first component is the genomic anchor
and includes a customizable DNA-binding domain, based on transcription
activator-like effectors (TALEs) from Xanthomonas sp., fused
to the light-sensitive cryptochrome 2 (CRY2) protein from Arabidopsis
thaliana (TALE-CRY2). The second component includes the interacting
partner of CRY2, CIB1, fused to a desired effector domain
(CIB1-effector). In the absence of light (inactive state), TALE-CRY2
binds the promoter region of the target gene while CIB1-effector remains
free within the nuclear compartment. Illumination with blue light triggers
a conformational change in CRY2 and subsequently recruits CIB1-effector
(VP64 shown in Fig. 1a) to the target locus to mediate transcriptional
modulation. This modular design allows each LITE component
to be independently engineered, so that the same genomic anchor can
be combined with activating or repressing effectors to exert positive
and negative transcriptional control over the same endogenous genomic
locus. In principle, the genomic anchor may also be replaced with
other DNA-binding domains such as zinc-finger proteins or RNA-guided
DNA-binding domains based on nucleolytically inactive mutants
of Cas9 (Extended Data Fig. 1).

To identify the most effective architecture, we assessed the efficacy
of different LITE designs by measuring transcriptional changes of the
neural lineage-specifying transcription factor neurogenin 2 (Neurog2)
induced by blue light illumination (Fig. 1b). Three out of four initial LITE
pairings produced significant light-induced Neurog2 messenger RNA upregulation
in Neuro 2a cells (P < 0.001, Fig. 1b). Of these, TALE(Neurog2)-CRY2PHR
complexed with CIB1-VP64 (TALE(Neurog2)-CRY2PHR/CIB1-VP64), containing a
truncated CRY2 consisting of the photolyase homology region alone
(CRY2PHR, amino acids 1-498), yielded the strongest light-mediated
transcription activation as well as the highest induction ratio
(light/no light mRNA levels). Therefore TALE-CRY2PHR/CIB1-VP64 was used
in subsequent experiments. To ensure optimal function, we also
systematically tuned light stimulation parameters (wavelength, Extended
Data Fig. 2a; duty cycle, Extended Data Fig. 2b; light intensity, Extended
Data Fig. 2c, d and Supplementary Discussion; and choice of activation
domain, Extended Data Fig. 2e).

Figure 1 | Design and optimization of the LITE system. a, Schematic of the
LITE system. Light stimulation induces dimerization of CRY2 and CIB1,
recruiting the effector to the target promoter. b, LITE architecture was
optimized by fusing TALE and the transcriptional activator VP64 to
different truncations of CRY2 and CIB1 (n next to each bar). c, Time-course of
light-dependent Neurog2 upregulation and decay post-illumination (n = 4
biological replicates; *P < 0.05; ***P < 0.001). Cells were stimulated with
5 mW cm−2 light (466 nm, 1 s pulses at 0.067 Hz). Mean ± s.e.m. in all panels.

Although the interaction between CRY2 and CIB1 occurs on a sub-second
timescale, LITE-mediated transcriptional activation is likely to
be dependent on many factors, including rate of transcription, mRNA
processing and transcript stability. We found that LITE-mediated
Neurog2 expression increased considerably as early as 30 min after ini- 
tial stimulation and rose steadily until saturating at 12 h with approxi- 
mately 20-fold upregulation compared to green fluorescent protein 
(GFP)-transfected negative controls (Fig. 1c). Interestingly, Neurog2 
transcript levels continued to increase for up to 30 min post-illumination, 
an effect that may have resulted from residual CRY2PHR-CIB1 dime- 
rization, which has been shown to reach complete dissociation after 15 
min (ref. 6), or from previously recruited RNA polymerases. There- 
after, Neurog2 mRNA returned to baseline levels with a half-life of 
~3 h. In contrast, a small-molecule inducible TALE system based on
the plant hormone abscisic acid receptor showed slower on- and off-kinetics
(Extended Data Fig. 3).
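
To make the reported half-life concrete, the sketch below models the return to baseline as simple first-order decay. That model, the function names and the choice of a 20-fold starting level are assumptions for illustration; the text reports only the approximate half-life (~3 h) and the approximate peak induction, and the decay begins only after the brief post-illumination rise described above.

```python
def fraction_remaining(hours, half_life_hours=3.0):
    """Fraction of the light-induced excess left after `hours` of decay."""
    return 0.5 ** (hours / half_life_hours)

def fold_change_after_light_off(hours, peak_fold,
                                half_life_hours=3.0, baseline_fold=1.0):
    """Fold expression relative to baseline, assuming first-order decay."""
    excess = peak_fold - baseline_fold
    return baseline_fold + excess * fraction_remaining(hours, half_life_hours)

# Starting from ~20-fold induction, the excess halves every 3 h.
after_3h = fold_change_after_light_off(3.0, peak_fold=20.0)
after_6h = fold_change_after_light_off(6.0, peak_fold=20.0)
```

Under these assumptions the excess over baseline would fall from 19-fold to 9.5-fold after 3 h and to 4.75-fold after 6 h.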

To apply LITEs to neuronal applications, we developed an adeno- 
associated virus (AAV)-based vector (Fig. 2a, b) for the delivery of 
TALE genes and a simplified process for AAV production (Extended 
Data Fig. 4 and Methods). The single-stranded DNA-based genome of 
AAV is less susceptible to recombination, providing an advantage over 
lentiviral vectors. We evaluated a panel of 28 TALE activators targeting
the mouse genome in primary neurons and found that most were
able to upregulate transcription (Fig. 2c). Moreover,
in vivo expression of TALE(Grm2)-VP64 in the prefrontal cortex
(PFC) (Fig. 2d, e) induced a 2.5-fold increase in Grm2 mRNA levels 
compared to GFP-only controls (Fig. 2f). 

Similarly, we introduced LITEs into primary cortical neurons via co- 
delivery of two AAVs (Fig. 3a, b). We tested a Grm2-targeted LITE at 
two light pulsing frequencies with a reduced duty cycle of 0.8% to ensure 
neuron health (Extended Data Fig. 5a). Both stimulation conditions 
achieved an approximately sevenfold light-dependent increase in Grm2 
mRNA levels (Fig. 3c). Further study verified that substantial target 
gene expression increases could be attained quickly (fourfold upregula- 
tion of mRNA within 4 h; Fig. 3d). In addition, we observed significant 
upregulation of mGluR2 protein after stimulation, confirming that LITE- 
mediated transcriptional changes are translated to the protein level 
(P<0.01 vs GFP control, P< 0.05 vs no-light condition; Fig. 3e). To 
test the in vivo functionality of the LITE system, we stereotactically deli- 
vered a 1:1 mixture of high-titre AAV vectors carrying TALE(Grm2)- 
CIB1 and CRY2PHR-VP64 into the PFC. We used a previously established 
fibre optic cannula system to deliver light to LITE-expressing neurons 
in vivo (Fig. 3f, g and Extended Data Fig. 5b). After 12 h of stimulation,
we observed a significant increase in Grm2 mRNA compared
with unstimulated PFC (Fig. 3h, P = 0.01). Taken together, these results 
confirm that LITEs enable optical control of endogenous gene expres- 
sion in cultured neurons and in vivo. 
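
The duty cycle quoted above follows directly from the pulse parameters given in the figure legends (250 ms pulses at 0.033 Hz or 500 ms pulses at 0.016 Hz in neurons; 1 s pulses at 0.067 Hz in Neuro 2a cells). A minimal sketch of the arithmetic, with hypothetical function names:

```python
def duty_cycle_percent(pulse_seconds, frequency_hz):
    """Percentage of time the light is on for periodic pulsing."""
    return pulse_seconds * frequency_hz * 100.0

def light_on_seconds(pulse_seconds, frequency_hz, total_hours):
    """Cumulative illumination time over a stimulation period."""
    return pulse_seconds * frequency_hz * total_hours * 3600.0

neuron_fast = duty_cycle_percent(0.25, 0.033)   # about 0.8%
neuron_slow = duty_cycle_percent(0.5, 0.016)    # 0.8%
neuro2a = duty_cycle_percent(1.0, 0.067)        # 6.7%
on_time_24h = light_on_seconds(0.5, 0.016, 24)  # seconds of light in 24 h
```

Both neuronal protocols deliver roughly the same total light dose, which is consistent with their similar levels of Grm2 induction.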

Given persistent baseline upregulation in vivo, we undertook further 
rounds of optimization to reduce background activity and improve the 
gene induction ratio. We observed that TALE(Grm2)-CIB1 alone produced
similar levels of upregulation as LITE background activation,
whereas CRY2PHR-VP64 alone did not significantly affect transcription
(Extended Data Fig. 5c). Therefore we reasoned that LITE-dependent
background transcriptional activation arises mainly from TALE-CIB1.

We subsequently designed a comprehensive screen to reduce base- 
line TALE-CIB1-mediated upregulation, focusing on two strategies. 
First, although CIB1 is a plant transcription factor, it may have 

Figure 2 | In vitro and in vivo AAV-mediated TALE delivery targeting
endogenous loci in neurons. a, Schematic AAV vectors for TALE delivery.
b, Representative images of primary cortical neurons expressing TALE-VP64.
c, TALE-VP64 constructs targeting a variety of endogenous neuronal genes
were screened for transcriptional activation in primary cortical neurons
(*P < 0.05; **P < 0.01; ***P < 0.001; n = 3 biological replicates). d,
TALE-VP64 expression in PFC. DAPI, 4’,6-diamidino-2-phenylindole; Cg1,
cingulate cortex area 1; PLC, prelimbic cortex; ILC, infralimbic cortex. e, Higher
magnification image of TALE-VP64-expressing neurons in PFC. f, Grm2
mRNA upregulation by TALE-VP64 in vivo in PFC (n = 4 animals).
Mean ± s.e.m. in all panels.

intrinsic activity in mammalian cells. To address this, we deleted
three CIB1 regions conserved amongst basic helix-loop-helix tran- 
scription factors of higher plants (Extended Data Fig. 6). Second, to 
prevent TALE-CIB1 from binding the target locus in the absence of light,
we engineered TALE-CIB1 to localize in the cytoplasm pending light- 
induced dimerization with the nuclear localization signal (NLS)- 
containing CRY2PHR-VP64 (Extended Data Fig. 7a and b). To test 
both strategies independently or in combination, we evaluated 73 dis- 
tinct LITE architectures and identified 12 effector-targeting domain 
pairs (denoted by the ‘+’ column in Extended Data Fig. 6) with both 
improved light-induction efficiency and reduced background (fold 
mRNA increase in the no-light condition compared with the original 
LITE; P < 0.05). One architecture successfully incorporating both strat- 
egies, designated LITE2.0, demonstrated the strongest light induction 
(light/no-light = 20.4) and resulted in greater than sixfold reduction of 
background activation compared with the original design (Fig. 3i). 
Another, LITE1.9.1, produced minimal background activation (1.06) 
while maintaining fourfold light induction (Extended Data Fig. 7c). 
Finally, we sought to expand the range of transcriptional processes 
addressable by TALEs and LITEs. We reasoned that TALE-mediated 
targeting of histone effectors to endogenous loci could induce specific 
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Figure 3 | LITE-mediated optogenetic modulation of endogenous 
transcription in primary neurons and in vivo. a, Schematic of AAV-LITE 
constructs. b, Images of primary neurons expressing LITE constructs. HA, 
haemagglutinin tag. c, Light-induced activation of Grm2 in primary neurons 
after 24h of stimulation (250 ms pulses at 0.033 Hz or 500 ms pulses at 

0.016 Hz;5mWcm 7;n=4 biological replicates). d, Upregulation of Grm2 in 
primary cortical neurons after 4h or 24h of stimulation. Expression levels are 
shown relative to neurons transduced with GFP only (number of biological 


epigenetic modifications, which would enable the interrogation of epi- 
genetic as well as transcriptional dynamics (Fig. 4a)”. We fused CRY2PHR 
with four concatenated mSin3 interaction domains (SID4X; Fig. 4b 
and Extended Data Fig. 8) and observed light-mediated transcrip- 
tional repression of Grm2 in neurons (Fig. 4c) accompanied by an 
approximately twofold reduction in H3K9 acetylation at the targeted 
Grm2 promoter (Fig. 4d). To expand the diversity of histone residue 
targets for locus-specific histone modification, we next derived a set of 
32 repressive histone effector domains (Supplementary Tables 1-5). 
Selected from across a wide phylogenetic spectrum, the domains include 
histone deacetylases (HDACs), methyltransferases (HMTs), acetyl- 
transferase (HAT) inhibitors, as well as HDAC and HMT recruiting 

replicates denoted within graph bars). e, Light-mediated changes in mGluR2 
protein levels (n = 7 biological replicates). f, Schematic of in vivo optogenetic 
stimulation setup. g, Representative images of PFC neurons expressing both 
LITE components. h, Light-induced activation of endogenous Grm2 expression 
using LITEs transduced into ILC. (**P < 0.01; number of animals denoted 
within graph bars.) i, LITE2.0 significantly reduces the level of background 
activation in Neuro 2a cells (n = 3 biological replicates). Mean ± s.e.m. in 
all panels. 


proteins. Preference was given to proteins and functional truncations 
of small size to facilitate efficient AAV packaging. The resulting epigenetic 
mark-modifying TALE-histone effector fusion constructs (epiTALEs) 
were evaluated in primary neurons and Neuro 2a cells for their ability 
to repress Grm2 and Neurog2 transcription, respectively (Fig. 4e, f and 
Extended Data Fig. 9). In primary neurons, 23 out of 24 epiTALEs successfully 
repressed transcription of Grm2 (P < 0.05). Similarly, epiTALE 
expression in Neuro 2a cells led to decreased Neurog2 expression for 
20 of the 32 histone effector domains tested (Extended Data Fig. 9; 
P < 0.05). We then expressed a subset of promising epiTALEs in primary 
neurons and Neuro 2a cells and quantified the relative histone residue 
mark levels at the target locus using chromatin immunoprecipitation 

Figure 4 | TALE- and LITE-mediated epigenetic modifications. a, LITE 
epigenetic modifiers (epiLITE). b, epiLITE AAV vectors. c, epiLITE-mediated 
repression of endogenous Grm2 in neurons (n = 4 biological replicates). 
d, epiLITE-mediated decrease in H3K9 histone acetylation at the Grm2 
promoter (n = 4 biological replicates). e, f, epiTALE methyltransferases 
mediated decrease in Grm2 mRNA and corresponding enrichment of 
H3K9me1, H4K20me3 and H3K27me3 at the Grm2 promoter (n = 3 biological 
replicates). g, h, epiTALE histone deacetylases mediated repression of Grm2 
and corresponding decreases in H4K8Ac and H3K9Ac marks at the Grm2 
promoter (n denoted within graph). Mean ± s.e.m. in all panels. 


followed by reverse transcription and quantitative PCR (ChIP-qRT-PCR; 
Fig. 4g, h and Extended Data Fig. 10). In primary neurons and 
Neuro 2a cells, epiTALE-mediated modifications were observed for the 
following histone marks: H3K9me1 (KYP (A. thaliana)), H4K20me3 
(TgSET8 (Toxoplasma gondii)), H3K27me3 (NUE and PHF19 (Chlamydia 
trachomatis and Homo sapiens)), H3K9ac (Sin3a, Sirt3 and NcoR 
(all H. sapiens)) and H4K8ac (HDAC8, RPD3 and Sir2a (Xenopus laevis, 
Saccharomyces cerevisiae and Plasmodium falciparum)). These domains 
provide a ready source of effectors for LITE-mediated control of spe- 
cific epigenetic modifications. 

Spatiotemporally precise perturbation of transcription and epige- 
netic states in vivo using LITE can enable researchers to test the causal 
role of gene regulation in diverse processes including development, 
learning and disease. TALEs can be conveniently customized to target 
a wide range of genomic loci, and other DNA-binding domains such as 
the RNA-guided Cas9 enzymes may be used in lieu of TALE to enable 
multiplexed transcriptional and epigenetic engineering of individual 
or groups of genomic loci in cells and whole organisms. Novel modes 
of LITE modulation can also be achieved by replacing the effector module 
with functional domains such as chromatin-modifying enzymes. 
The LITE system enables a powerful set of novel capabilities for the 
optogenetic toolbox and establishes a highly generalizable and versa- 
tile platform for reverse-engineering the function and regulation of 
mammalian genomes. 


METHODS SUMMARY 


LITE constructs were transfected into Neuro 2a cells using GenJet. AAV vectors car- 
rying TALE or LITE constructs were used to transduce mouse primary embryonic 
cortical neurons as well as the mouse brain in vivo. RNA was extracted and reverse 
transcribed and mRNA levels were measured using TaqMan-based qRT-PCR. 
Light-emitting diodes or solid-state lasers were used for light delivery in tissue 
culture and in vivo, respectively. 


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 

METHODS 


Design and construction of LITEs. All LITE construct sequences can be found in 
the Sequences section of the Supplementary Information. We evaluated full-length 
CRY2 as well as a truncation consisting of the photolyase homology region alone 
(CRY2PHR, amino acids 1-498). For CIB1, we tested the full-length protein as 
well as an amino-terminal domain-only fragment (CIBN, amino acids 1-170). 
The efficacy of each design was determined based on the level of light-dependent 
upregulation of the endogenous target Neurog2 mRNA (Fig. 1b). To use AAV as a 
vector for the delivery of LITE components, we needed to ensure that the total viral 
genome size of each recombinant AAV, with the LITE transgenes included, did 
not exceed the packaging limit of 4.8 kilobases. We shortened the TALE N and 
carboxy termini (keeping 136 amino acids in the N terminus and 63 in the C terminus) 
and exchanged the CRY2PHR (1.5 kb) and CIB1 (1 kb) domains (TALE-CIB1 
and CRY2PHR-VP64; Fig. 3a). TALE binding sequences were selected based 
on DNase I-sensitive regions in the promoter of each target gene. TALE targeting 
sequences are listed in Supplementary Table 6. 
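
The packaging constraint described above amounts to a simple size budget. The following is a minimal sketch of such a check, not the authors' procedure: only the 4.8 kb limit and the 1.5 kb CRY2PHR size come from the text, and every other element size in the example dictionary is a hypothetical placeholder.

```python
AAV_PACKAGING_LIMIT_KB = 4.8  # packaging limit quoted in the Methods


def total_genome_size_kb(elements_kb):
    """Sum the sizes (in kb) of all elements in a recombinant AAV genome."""
    return sum(elements_kb.values())


def fits_packaging_limit(elements_kb, limit_kb=AAV_PACKAGING_LIMIT_KB):
    """Return True if the summed element sizes do not exceed the limit."""
    return total_genome_size_kb(elements_kb) <= limit_kb


# Illustrative vector. Only the CRY2PHR size (1.5 kb) is taken from the text;
# the other two entries are hypothetical placeholders, not measured values.
example_vector = {
    "ITRs_promoter_polyA": 1.5,  # hypothetical
    "CRY2PHR": 1.5,              # from the text
    "effector_and_tags": 0.5,    # hypothetical
}

if __name__ == "__main__":
    size = total_genome_size_kb(example_vector)
    print(f"total = {size:.1f} kb, fits = {fits_packaging_limit(example_vector)}")
```

Real element sizes would have to be taken from the construct sequences in the Supplementary Information.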

Neuro 2a culture and experiments. Neuro 2a cells (Sigma-Aldrich) were grown 
in media containing a 1:1 ratio of OptiMEM (Life Technologies) to high-glucose 
DMEM with GlutaMax and sodium pyruvate (Life Technologies) supplemented 
with 5% HyClone heat-inactivated FBS (Thermo Scientific) and 1% penicillin/streptomycin 
(Life Technologies), and passaged at 1:5 every 2 days. 120,000 cells were plated 
in each well of a 24-well plate 18-20 h before transfection. 1 h before transfection, 
media was changed to DMEM supplemented with 5% HyClone heat-inactivated 
FBS and 1% penicillin/streptomycin. Cells were transfected with 1.0 µg total of 
construct DNA (at equimolar ratios) per well with 1.5 µl of GenJet (SignaGen 
Laboratories) transfection reagent according to the manufacturer’s instructions. 
Media was exchanged 24 h and 44 h post-transfection and light stimulation was 
started at 48 h. Stimulation parameters were: 5 mW cm⁻², 466 nm, 7% duty cycle 
(1 s light pulse at 0.067 Hz) for 12 h unless indicated otherwise in figure legends. RNA 
was extracted using the RNeasy kit (Qiagen) or NucleoSpin RNA kit (Macherey-Nagel) 
according to the manufacturer’s instructions and 1 µg of RNA per sample was 
reverse-transcribed using qScript (Quanta Biosystems). Relative mRNA levels 
were measured by reverse transcription and quantitative PCR (qRT-PCR) using 
TaqMan probes specific for the targeted gene as well as GAPDH as an endogenous 
control (Life Technologies, see Supplementary Table 7 for TaqMan probe IDs). 
ΔΔCt analysis was used to obtain fold-changes relative to negative controls 
transduced with GFP only and subjected to light stimulation. Toxicity experiments 
were conducted using the LIVE/DEAD assay kit (Life Technologies) according to 
manufacturer’s protocol. 
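
The fold-change calculation relative to GFP-only controls, with GAPDH as the endogenous control, can be sketched with the standard 2^(−ΔΔCt) formula. This is an illustration only: it assumes roughly 100% amplification efficiency, and the Ct values in the usage example are invented, not data from this study.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    ct_target_*: Ct of the gene of interest (e.g. Neurog2)
    ct_ref_*:    Ct of the endogenous control (e.g. GAPDH)
    *_sample:    LITE-expressing cells
    *_control:   GFP-only negative control
    Assumes ~100% PCR efficiency (a doubling per cycle).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Invented Ct values: the target crosses threshold 2 cycles earlier in the
    # sample than in the control, at equal GAPDH Ct.
    print(fold_change(24.0, 18.0, 26.0, 18.0))
```

A target that crosses threshold two cycles earlier than in the control, at equal reference Ct, corresponds to a fourfold increase.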

AAV vector production. The ssDNA-based genome of AAV is less susceptible to 
recombination, thus providing an advantage over RNA-based lentiviral vectors 
for the packaging and delivery of highly repetitive TALE sequences. 293FT cells 
(Life Technologies) were grown in antibiotic-free D10 media (DMEM high glucose 
with GlutaMax and sodium pyruvate, 10% heat-inactivated HyClone FBS, 
and 1% 1 M HEPES) and passaged daily at 1:2-2.5. The total number of passages 
was kept below 10 and cells were never grown beyond 85% confluence. The day 
before transfection, 10⁷ cells in 21.5 ml of D10 media were plated onto 15-cm 
dishes and incubated for 18-22 h or until ~80% confluence. For use as a transfection 
reagent, 1 mg ml⁻¹ of PEI “Max” (Polysciences) was dissolved in water and 
the pH of the solution was adjusted to 7.1. For AAV production, 10.4 µg of pDF6 
helper plasmid, 8.7 µg of pAAV1 serotype packaging vector, and 5.2 µg of pAAV 
vector carrying the gene of interest were added to 434 µl of serum-free DMEM and 
130 µl of PEI “Max” solution was added to the DMEM-diluted DNA mixture. The 
DNA/DMEM/PEI cocktail was vortexed and incubated at room temperature for 
15 min. After incubation, the transfection mixture was added to 22 ml of complete 
media, vortexed briefly, and used to replace the media for a 15-cm dish of 293FT 
cells. For supernatant production, transfection supernatant was collected at 48 h, 
filtered through a 0.45 µm PVDF filter (Millipore), distributed into aliquots, and 
frozen for storage at −80 °C. 
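
The per-dish transfection recipe above scales linearly with the number of 15-cm dishes. A minimal sketch of that arithmetic follows; the per-dish amounts are those stated in the text, while the function names and the optional pipetting overage are assumptions added for illustration.

```python
# Per-dish amounts as stated in the Methods (one 15-cm dish of 293FT cells).
PER_DISH = {
    "pDF6_helper_ug": 10.4,
    "pAAV1_serotype_ug": 8.7,
    "pAAV_vector_ug": 5.2,
    "serum_free_DMEM_ul": 434.0,
    "PEI_Max_ul": 130.0,
}
PEI_STOCK_MG_PER_ML = 1.0  # stock concentration stated in the text


def scale_mix(n_dishes, overage=0.0):
    """Scale the recipe to n_dishes; overage (e.g. 0.1 = 10%) is an assumption."""
    factor = n_dishes * (1.0 + overage)
    return {name: amount * factor for name, amount in PER_DISH.items()}


def pei_to_dna_mass_ratio():
    """Mass ratio of PEI to total plasmid DNA implied by the recipe."""
    dna_ug = sum(v for k, v in PER_DISH.items() if k.endswith("_ug"))
    pei_ug = PER_DISH["PEI_Max_ul"] * PEI_STOCK_MG_PER_ML  # 1 mg/ml equals 1 ug/ul
    return pei_ug / dna_ug


if __name__ == "__main__":
    for name, amount in scale_mix(5).items():
        print(f"{name}: {amount:.1f}")
    print(f"PEI:DNA mass ratio = {pei_to_dna_mass_ratio():.2f}")
```

Calling `scale_mix(5)` gives the totals for a five-dish preparation; the implied PEI:DNA mass ratio is 130 µg to 24.3 µg.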

Primary cortical neuron culture. Dissociated cortical neurons were prepared 
from C57BL/6N mouse embryos on E16 (Charles River Labs). Cortical tissue was 
dissected in ice-cold HBSS (50 ml 10× HBSS, 435 ml dH2O, 0.3 M HEPES pH 7.3, 
and 1% penicillin/streptomycin). Cortical tissue was washed 3 times with 20 ml of 
ice-cold HBSS and then digested at 37 °C for 20 min in 8 ml of HBSS with 240 µl of 
2.5% trypsin (Life Technologies). Cortices were then washed 3 times with 20 ml of 
warm HBSS containing 1 ml FBS. Cortices were gently triturated in 2 ml of HBSS 
and plated at 150,000 cells per well in poly-D-lysine-coated 24-well plates (BD 
Biosciences). Neurons were maintained in Neurobasal media (Life Technologies), 
supplemented with 1× B27 (Life Technologies), GlutaMax (Life Technologies) 
and 1% penicillin/streptomycin. 

Primary neuron transduction and light stimulation experiments. Primary 
cortical neurons were transduced with 250 µl of AAV1 supernatant on DIV 5 (DIV, 
days in vitro). The media and supernatant were replaced with regular complete 
Neurobasal the following day. Neurobasal was exchanged with Minimal Essential 
Medium (Life Technologies) containing 1× B27, GlutaMax (Life Technologies) 
and 1% penicillin/streptomycin 6 days after AAV transduction to prevent forma- 
tion of phototoxic products from HEPES and riboflavin contained in Neurobasal 
during light stimulation. For co-transduction of primary neurons with two AAV 
vectors, the co-delivery efficiency is >80%, with individual components having 
transduction efficiencies between 83-92%. 

Light stimulation was started 6 days after AAV transduction (DIV 11) with an 
intensity of 5 mW cm⁻², duty cycle of 0.8% (250 ms pulses at 0.033 Hz or 500 ms 
pulses at 0.016 Hz), 466 nm blue light for 24 h unless indicated otherwise in figure 
legends. RNA extraction and reverse transcription were performed using the Cells- 
to-Ct kit according to the manufacturer’s instructions (Life Technologies). Rela- 
tive mRNA levels were measured by reverse transcription and quantitative PCR 
(qRT-PCR) using TaqMan probes as described above for Neuro 2a cells. 
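
The duty cycles quoted for light stimulation follow directly from pulse duration multiplied by pulse frequency. The sketch below works through that arithmetic for the stated parameters; the function names are illustrative, and the cumulative dose calculation is an added convenience rather than something reported in the text.

```python
def duty_cycle_percent(pulse_s, frequency_hz):
    """Fraction of time the light is on, as a percentage."""
    return pulse_s * frequency_hz * 100.0


def cumulative_dose_j_per_cm2(intensity_mw_per_cm2, duty_percent, hours):
    """Total delivered light energy per unit area over the stimulation period."""
    on_seconds = hours * 3600.0 * duty_percent / 100.0
    return intensity_mw_per_cm2 / 1000.0 * on_seconds


if __name__ == "__main__":
    # Primary neurons: 250 ms at 0.033 Hz or 500 ms at 0.016 Hz
    print(round(duty_cycle_percent(0.25, 0.033), 1))
    print(round(duty_cycle_percent(0.5, 0.016), 1))
    # Neuro 2a cells: 1 s at 0.067 Hz
    print(round(duty_cycle_percent(1.0, 0.067)))
    # Dose for 5 mW cm^-2 at 0.8% duty cycle over 24 h
    print(cumulative_dose_j_per_cm2(5.0, 0.8, 24.0))
```

Both neuron protocols round to the stated 0.8%, and the Neuro 2a protocol (6.7%) rounds to the stated 7%.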

Immunohistochemistry of primary neurons. For immunohistochemistry of 
primary neurons, cells were plated on poly-D-lysine/laminin coated coverslips (BD 
Biosciences) after harvesting. AAV1 transductions were performed as described 
above. Neurons were fixed 7 days post-transduction with 4% paraformaldehyde 
(Sigma-Aldrich) for 15 min at room temperature. Blocking and permeabilization 
were performed with 10% normal goat serum (Life Technologies) and 0.5% 
Triton-X100 (Sigma-Aldrich) in DPBS (Life Technologies) for 1 h at room 
temperature. Neurons were incubated with primary antibodies overnight at 4 °C, 
washed 3× with DPBS and incubated with secondary antibodies for 90 min at room 
temperature. For antibody providers and concentrations used, see Supplementary 
Table 8. Coverslips 
were finally mounted using Prolong Gold Antifade Reagent with DAPI (Life Tech- 
nologies) and imaged on an Axio Scope A.1 (Zeiss) with an X-Cite 120Q light 
source (Lumen Dynamics). Images were acquired using an AxioCam MRm cam- 
era and AxioVision 4.8.2. 

Western blots. For preparation of total protein lysates, primary cortical neurons 
were collected after light stimulation (see above) in ice-cold lysis buffer (RIPA, Cell 
Signaling; 0.1% SDS, Sigma-Aldrich; and cOmplete ULTRA protease inhibitor 
mix, Roche Applied Science). Cell lysates were sonicated for 5 min at ‘M’ setting 
with the Bioruptor water bath sonicator (Diagenode) and centrifuged at 21,000g 
for 10 min at 4 °C. Protein concentration was determined using the RC DC protein 
assay (Bio-Rad). 30-40 µg of total protein per lane was separated under non-reducing 
conditions on 4-15% Tris-HCl gels (Bio-Rad) along with Precision 
Plus Protein Dual Color Standard (Bio-Rad). After wet electrotransfer to polyvinylidene 
difluoride membranes (Millipore) and membrane blocking for 45 min in 
5% BLOT-QuickBlocker (Millipore) in Tris-buffered saline (TBS, Bio-Rad), western 
blots were probed with anti-mGluR2 (Abcam, 1:1,000) and anti-α-tubulin 
(Sigma-Aldrich 1:20,000) overnight at 4 °C, followed by washing and anti-mouse- 
IgG HRP antibody incubation (Sigma-Aldrich, 1:5,000-1:10,000). For further anti- 
body details see Supplementary Table 8. Detection was performed via ECL western 
blot substrate (SuperSignal West Femto Kit, Thermo Scientific). Blots were imaged 
with an AlphaImager system (Innotech), and quantified using ImageJ software 1.46r. 

Production of concentrated and purified AAV1/2 vectors. Production of concentrated 
and purified AAV for stereotactic injection in vivo was performed using 
the same initial steps outlined above for production of AAV1 supernatant. However, 
for transfection, equal ratios of AAV1 and AAV2 serotype plasmids were 
used instead of AAV1 alone. Five 15-cm plates were transfected per construct and 
cells were collected with a cell-scraper 48 h post transfection. Purification of AAV1/2 
particles was performed using HiTrap heparin affinity columns (GE Healthcare). 
We added a second concentration step down to a final volume of 100 µl per construct 
using an Amicon 500 µl concentration column (100 kDa cutoff, Millipore) 
to achieve higher viral titres. Titration of AAV was performed by qRT-PCR using 
a custom TaqMan probe for WPRE (woodchuck hepatitis post-transcriptional 
response element; Life Technologies). Prior to qRT-PCR, concentrated AAV was 
treated with DNase I (New England Biolabs) to achieve a measurement of DNase 
I-resistant particles only. Following DNase I heat-inactivation, the viral envelope 
was degraded by proteinase K digestion (New England Biolabs). Viral titre was 
calculated based on a standard curve with known WPRE copy numbers. 
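
Titre calculation from a standard curve can be sketched as a linear fit of Ct against log10 copy number, inverted for each unknown sample. This is a generic illustration, not the authors' analysis: the standard-curve values, the template volume and the dilution factor in the example are all hypothetical.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to get WPRE copies per reaction."""
    return 10.0 ** ((ct - intercept) / slope)


def titre_per_ml(copies_per_reaction, template_ul, dilution_factor):
    """Genome copies per ml of the undiluted preparation."""
    return copies_per_reaction * dilution_factor / template_ul * 1000.0


if __name__ == "__main__":
    # Hypothetical standards (known WPRE copy numbers and their Ct values).
    std_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
    std_cts = [30.1, 26.8, 23.5, 20.2, 16.9]
    slope, intercept = fit_standard_curve(std_copies, std_cts)
    n_copies = copies_from_ct(21.85, slope, intercept)
    # Hypothetical: 1 ul of a 1:1,000 dilution used as template.
    print(f"{titre_per_ml(n_copies, 1.0, 1000.0):.3g} genome copies per ml")
```

Because the sample is DNase I-treated before measurement, the copies counted this way correspond to DNase I-resistant particles only.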

Stereotactic injection of AAV1/2 and optical implant. All animal procedures 
were approved by the MIT Committee on Animal Care. Adult (10-14 weeks old) 
male C57BL/6N mice were anaesthetized by intraperitoneal (i.p.) injection of 
ketamine/xylazine (100 mg kg⁻¹ ketamine and 10 mg kg⁻¹ xylazine) and pre-emptive 
analgesia was applied (Buprenex, 1 mg kg⁻¹, i.p.). Craniotomy was performed according 
to approved procedures and 1 µl of AAV1/2 was injected into ILC at 
0.35/1.94/−2.94 (lateral, anterior and inferior coordinates in mm relative to bregma). 
During the same surgical procedure, an optical cannula with fibre (Doric Lenses) 
was implanted into ILC unilaterally with the end of the optical fibre located at 
0.35/1.94/−2.74 relative to bregma. The cannula was affixed to the skull using Metabond 
dental cement (Parkell Inc.) and Jet denture repair (Lang Dental) to build a 
stable, supporting cone. The incision was sutured and proper post-operative analgesics 
were administered for 3 days following surgery. 

Immunohistochemistry on ILC brain sections. Mice were injected with a lethal 
dose of ketamine/xylazine anaesthetic and transcardially perfused with PBS and 
4% paraformaldehyde (PFA). Brains were additionally fixed in 4% PFA at 4°C 
overnight and then transferred to 30% sucrose for cryoprotection overnight at 
room temperature. Brains were then transferred into Tissue-Tek Optimal Cutting 
Temperature (OCT) Compound (Sakura Finetek) and frozen at −80 °C. 18-µm 
sections were cut on a cryostat (Leica Biosystems) and mounted on Superfrost Plus 
glass slides (Thermo Fisher). Sections were post-fixed with 4% PFA for 15 min, and 
immunohistochemistry was performed as described for primary neurons above. 

Light stimulation and mRNA level analysis in ILC. Neurons at the injection site 
were efficiently co-transduced by both viruses, with >80% of transduced cells 
expressing both TALE(Grm2)-CIB1 and CRY2PHR-VP64 (Fig. 3g and Extended 
Data Fig. 5b). 8 days post-surgery, awake and freely moving mice were stimulated 
using a 473 nm laser source (OEM Laser Systems) connected to the optical implant 
via fibre patch cables and a rotary joint. Stimulation parameters were identical to 
those used on primary neurons: 5 mW (total output), 0.8% duty cycle (500 ms light 
pulses at 0.016 Hz) for a total of 12h. Brain tissue from the fibre optic cannula 
implantation site was analysed (Fig. 3h) for changes in Grm2 mRNA. Experimental 
conditions, including transduced constructs and light stimulation are listed in 
Supplementary Table 9. 

After the end of light stimulations, mice were euthanized using CO2 and the 
prefrontal cortices (PFC) were quickly dissected on ice and incubated in RNAlater 
(Qiagen) at 4 °C overnight. 200 µm sections were cut in RNAlater at 4 °C on a 
vibratome (Leica Biosystems). Sections were then frozen on a glass cover slide on 
dry ice and virally transduced ILC was identified under a fluorescent stereomicroscope 
(Leica M165 FC). A 0.35-mm diameter punch of ILC, located directly ventrally 
to the termination of the optical fibre tract, was extracted (Harris uni-core, 
Ted Pella). The brain punch sample was then homogenized using an RNase-free 
pellet-pestle grinder (Kimble Chase) in 50 µl Cells-to-Ct RNA lysis buffer and RNA 
extraction, reverse transcription and qRT-PCR were performed as described for 
primary neuron samples. 

Chromatin immunoprecipitation (ChIP). Neurons or Neuro 2a cells were cultured 
and transduced or transfected as described above. ChIP samples were prepared 
as previously described with minor adjustments for the cell number and 
cell type. Cells were collected in 24-well format, washed in 96-well format, and 
transferred to microcentrifuge tubes for lysis. Sample cells were directly lysed by 
water bath sonication with the Bioruptor sonication device for 21 min using 30 s 
on/off cycles and 4 °C chilled circulation (Diagenode). qRT-PCR was used to assess 
enrichment of histone marks at the targeted locus. qRT-PCR primer sequences 
are listed in Supplementary Table 10. 

Statistical analysis. All experiments were performed with a minimum of two 
independent biological replicates. Statistical analysis was performed with Prism 
(GraphPad) using Student’s two-tailed t-test when comparing two conditions, 
ANOVA with Tukey’s post-hoc analysis when comparing multiple samples with 
each other, and ANOVA with Dunnett’s post-hoc analysis when comparing mul- 
tiple samples to the negative control. 

Extended Data Figure 1 | RNA-guided DNA binding protein Cas9 can be 
used to target transcription effector domains to specific genomic loci. 
a, The RNA-guided nuclease Cas9 from the type II Streptococcus pyogenes 
CRISPR/Cas system can be converted into a nucleolytically inactive RNA-guided 
DNA binding protein (Cas9**) by introducing two alanine 
substitutions (D10A and H840A). Schematic showing that a synthetic guide 
RNA (sgRNA) can direct Cas9**-effector fusion to a specific locus in the 
human genome. The sgRNA contains a 20-bp guide sequence at the 5’ end 
which specifies the target sequence. On the target genomic DNA, the 20-bp 
target site needs to be followed by a 5’-NGG PAM motif. b, c, Schematics 
showing the sgRNA target sites in the human KLF4 and SOX2 loci, respectively. 
Each target site is indicated by the blue bar and the corresponding PAM 
sequence is indicated by the magenta bar. d, e, Schematics of the Cas9**-VP64 
transcription activator and SID4X-Cas9** transcription repressor constructs. 
f, g, Cas9**-VP64- and SID4X-Cas9**-mediated activation of KLF4 and 
repression of SOX2, respectively. All mRNA levels were measured relative to 
GFP mock-transfected 293FT cells (mean ± s.e.m.; n = 3 biological replicates). 
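
The targeting rule stated in panel a (a 20-bp target site immediately followed by an NGG PAM) can be illustrated with a short search over a DNA sequence. This is a minimal sketch that scans the given strand only; the function name and the example sequence are made up for illustration and are not the KLF4 or SOX2 target sites.

```python
import re

# A 20-nt protospacer followed by an NGG PAM; the lookahead allows
# overlapping candidate sites to be reported.
_SITE = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


def find_sgrna_sites(sequence):
    """Return (start, protospacer, pam) for every 20-bp site followed by NGG.

    Only the strand as given is scanned; reverse-strand sites would require
    running the same search on the reverse complement.
    """
    seq = sequence.upper()
    return [(m.start(), m.group(1), m.group(2)) for m in _SITE.finditer(seq)]


if __name__ == "__main__":
    # Made-up example sequence containing a single valid site.
    example = "GATTACAGATTACAGATTAC" + "AGG" + "TT"
    for start, guide, pam in find_sgrna_sites(example):
        print(start, guide, pam)
```

Practical guide selection would also consider off-target matches and chromatin accessibility, which this sketch does not attempt.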

Extended Data Figure 2 | Engineering of light stimulation parameters and 
activation domains of LITEs. a, Illustration of the absorption spectrum of 
CRY2 in vitro. Cryptochrome 2 was optimally activated by 350-475 nm light. 
A sharp drop in absorption and activation was seen for wavelengths greater 
than 480 nm. Spectrum was adapted from ref. 23. b, Impact of illumination 
duty cycle on LITE-mediated gene expression. Varying duty cycles 
(illumination as percentage of total time) were used to stimulate 293FT cells 
expressing LITEs targeting the KLF4 gene, to investigate the effect of duty cycle 
on LITE activity. KLF4 expression levels were compared to cells expressing 
GFP only. Stimulation parameters were: 466 nm, 5 mW cm⁻² for 24 h. Pulses 
were performed at 0.067 Hz with the following durations: 1.7% = 0.25 s pulse, 
7% = 1 s pulse, 27% = 4 s pulse, 100% = constant illumination 
(mean ± s.e.m.; n = 3-4 biological replicates). c, The transcriptional activity of 
CRY2PHR/CIB1 LITE was found to vary according to the intensity of 466 nm 
blue light. Neuro 2a cells were stimulated for 12 h at a 7% duty cycle (1 s pulses 
at 0.067 Hz). All Neurog2 mRNA levels were measured relative to cells 
expressing GFP only (mean ± s.e.m.; n = 3-4 biological replicates). d, Light-induced 
toxicity measured as the percentage of cells positive for red-fluorescent 
ethidium homodimer-1 versus calcein-positive cells (mean ± s.e.m.; n = 3 
biological replicates; **P < 0.01). e, We compared the activation domains 
VP16 and p65 in addition to VP64 to test the modularity of the LITE CIB1-effector 
component. Neurog2 upregulation with and without light by LITEs 
using different transcriptional activation domains (VP16, VP64 and p65). 
Neuro 2a cells transfected with LITE were stimulated for 24 h with 466 nm light 
at an intensity of 5 mW cm⁻² and a duty cycle of 7% (1 s pulses at 0.067 Hz). All 
three domains produced a significant light-dependent Neurog2 mRNA 
upregulation (P < 0.001). We selected VP64 for subsequent experiments due 
to its lower basal activity in the absence of light-stimulation (mean ± s.e.m.; 
n = 3-4 biological replicates). 

Extended Data Figure 3 | Chemical induction of endogenous gene 
transcription. a, Schematic showing the design of a chemical inducible 
two-hybrid TALE system based on the abscisic acid (ABA) receptor system. 
ABI and PYL dimerize upon the addition of ABA and dissociate when ABA is 
withdrawn. b, Time-course of ABA-dependent Neurog2 upregulation. 250 µM 
of ABA was added to Neuro 2a cells expressing TALE(Neurog2)-ABI and 
PYL-VP64. Fold mRNA increase was measured at the indicated time points 
after the addition of ABA. c, Decrease of Neurog2 mRNA levels after 24 h of 
ABA stimulation. All Neurog2 mRNA levels were measured relative to GFP-expressing 
control cells (mean ± s.e.m.; n = 3-4 biological replicates). 

Extended Data Figure 4 | Efficient AAV production using cell supernatant. 
a, Lentiviral and AAV vectors carrying GFP were used to test transduction 
efficiency. b, Primary cortical neurons were transduced with 300 and 250 µl 
supernatant derived from the same number of lentivirus- or AAV-transduced 
293FT cells. Representative images of GFP expression were collected at 7 days 
post infection. Scale bars, 50 µm. c, The depicted process was developed for the 
production of AAV supernatant and subsequent transduction of primary 
neurons. 293FT cells were transfected with an AAV vector carrying the gene of 
interest, the AAV1 serotype packaging vector (pAAV1), and helper plasmid 
(pDF6) using PEI. 48 h later, the supernatant was collected and filtered through 
a 0.45 µm PVDF membrane. Primary neurons were then transduced with 
supernatant and remaining aliquots were stored at −80 °C. Stable levels of 
AAV construct expression were reached after 5-6 days. AAV supernatant 
production following this process can be used for production of up to 96 
different viral constructs in 96-well format (used for TALE screen in neurons 
shown in Fig. 2c). 

Extended Data Figure 5 | Characterizing LITEs in neurons and in vivo. 
a, Impact of light duty cycle on primary neuron health. The effect of light 
stimulation on primary cortical neuron health was compared for duty cycles of 
7%, 0.8%, and no light conditions. Calcein was used to evaluate neuron 
viability. Bright-field images show cell morphology and integrity. Primary 
cortical neurons were stimulated with the indicated duty cycle for 24 h with 
5 mW cm⁻² of 466 nm light. Representative images, scale bar, 50 µm. Pulses 
were performed in the following manner: 7% duty cycle = 1 s pulse at 0.067 Hz, 
0.8% duty cycle = 0.5 s pulse at 0.0167 Hz. b, Co-transduction efficiency of 
LITE components by AAV1/2 in vivo in mouse infralimbic cortex. Cells 
transduced by TALE(Grm2)-CIB1 alone, CRY2PHR-VP64 alone, or 
co-transduced were calculated as a percentage of all transduced cells 
(mean ± s.e.m.; n = 9 fields from 3 animals). c, Grm2 mRNA levels were 
determined in primary neurons transfected with individual LITE components. 
Primary neurons expressing TALE(Grm2)-CIB1 alone led to a similar increase 
in Grm2 mRNA levels as unstimulated cells expressing the complete LITE 
system (mean ± s.e.m.; n = 3-4 biological replicates). 

Extended Data Figure 6 | Effects of LITE component engineering on 
activation, background signal and fold induction. Protein modifications 
were used to find LITE components resulting in reduced background 
transcriptional activation while improving induction ratio by light. In brief, 
nuclear localization signals and mutations in an endogenous nuclear export 
signal were used to improve nuclear import of the CRY2PHR-VP64 
component. Several variations of CIB1 intended to either reduce nuclear 
localization or CIB1 transcriptional activation were pursued to reduce the 
contribution of the TALE-CIB1 component to background activity. The 
results of all tested combinations of CRY2PHR-VP64 and TALE-CIB1 are 
shown above. The table to the left of the bar graphs indicates the particular 
combination of domains/mutations used for each condition. Each row of the 
table and bar graphs contains the component details, light/no light activity, and 
induction ratio by light for the particular CRY2PHR/CIB1 combination. 
Combinations that resulted in both decreased background and increased fold 
induction compared to LITE1.0 are highlighted in green in the table column 
marked ‘+’ (t-test P < 0.05). See Supplementary Discussion for detailed 
explanation of each modification (mean ± s.e.m.; n = 2-3 biological 
replicates). 

Extended Data Figure 7 | Strategies for optimizing the LITE system. a, In 
the absence of light, the TALE-CIB1 LITE component resides in the cytoplasm 
due to the absence of a nuclear localization signal, NLS (or the addition of a 
nuclear export signal, NES). The CRY2PHR-VP64 component containing a 
NLS on the other hand is actively imported into the nucleus on its own. b, In 
the presence of blue light, TALE-CIB1 binds to CRY2PHR. The NLS present in 
CRY2PHR-VP64 now mediates nuclear import of the complex of both LITE 
components, enabling them to activate transcription at the targeted locus. In 
addition to the LITE2.0 constructs, several CRY2PHR-VP64/TALE-CIB1 
combinations from the engineered LITE component screen were of particular 
note. LITE1.9.0, which combined the α-importin NLS effector construct with a 
mutated endogenous NLS and Δ276-307 TALE-CIB1 construct, exhibited an 
induction ratio greater than 9 and an absolute light activation of more than 180. 
LITE1.9.1, which combined the unmodified CRY2PHR-VP64 with a mutated 
NLS, Δ318-334, AD5 NES TALE-CIB1 construct, achieved an induction ratio 
of 4 with a background activation of 1.06. A selection of other LITE1.9 
combinations with background activations lower than 2 and induction ratios 
ranging from 7 to 12 were also highlighted (mean ± s.e.m.; n = 2-3 biological 
replicates). 

Extended Data Figure 8 | TALE SID4X repressor characterization and 
application in neurons. a, A synthetic repressor was constructed by 
concatenating 4 SID domains (SID4X). To identify the optimal TALE-repressor 
architecture, SID or SID4X was fused to a TALE designed to target 
the mouse p11 (also known as S100a10) gene. b, Fold decrease in p11 mRNA 
was assayed using qRT-PCR (mean ± s.e.m.; n = 3 biological replicates). 
c, General schematic of constitutive TALE transcriptional repressor packaged 
into AAV. Effector domain SID4X is highlighted. hSyn, human synapsin 
promoter; 2A, Thosea asigna virus 2A self-cleaving peptide; WPRE, 
woodchuck hepatitis post-transcriptional response element; bGH pA, bovine 
growth hormone poly-A signal. phiLOV2.1 (330 bp) was chosen as a shorter 
fluorescent marker to ensure efficient AAV packaging. d, A TALE targeting 
either the endogenous mouse locus Grm5 or Grm2 was fused to SID4X and 
virally transduced into primary neurons. SID4X-mediated target gene 
downregulation is shown for each TALE relative to levels in control neurons 
expressing GFP only (mean ± s.e.m.; n = 3-4 biological replicates). 

Extended Data Figure 9 | A diverse set of epiTALEs mediate transcriptional 
repression in neurons and Neuro 2a cells. a, 24 different histone effector 
domains were each fused to a Grm2 targeting TALE. TALE-effector fusions 
were expressed in primary cortical mouse neurons using AAV transduction. 
Grm2 mRNA levels were measured using qRT-PCR relative to neurons 
transduced with GFP only. (*P < 0.05; mean ± s.e.m.; n = 2-3 biological 
replicates.) b, A total of 32 epiTALEs were transfected into Neuro 2a cells. 20 of 
them mediated significant repression of the targeted Neurog2 locus (*P < 0.05; 
mean ± s.e.m.; n = 2-3 biological replicates). 

along with histone modifications in Neuro 2a cells. a, TALEs fused to 
histone-deacetylating epigenetic effectors NcoR and SIRT3 targeting the 
murine Neurog2 locus in Neuro 2a cells were assayed for repressive activity on 
Neurog2 transcript levels (mean ± s.e.m.; n = 2-3 biological replicates). 
b, ChIP qRT-PCR showing a reduction in H3K9 acetylation at the Neurog2 
promoter for NcoR and SIRT3 epiTALEs (mean ± s.e.m.; n = 2-3 biological 
replicates). 
methyltransferase binding activity was fused to a TALE targeting Neurog2. 
Repression of Neurog2 mRNA levels was observed (mean ± s.e.m.; n = 2-3 
biological replicates). d, ChIP qRT-PCR showing an increase in H3K27me3 
levels at the Neurog2 promoter for the PHF19 epiTALE (mean ± s.e.m.; 
n = 2-3 biological replicates). 
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Charting a dynamic DNA methylation landscape of 


the human genome 
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DNA methylation is a defining feature of mammalian cellular iden- 
tity and is essential for normal development’. Most cell types, except 
germ cells and pre-implantation embryos**, display relatively 
stable DNA methylation patterns, with 70-80% of all CpGs being 
methylated*. Despite recent advances, we still have a limited under- 
standing of when, where and how many CpGs participate in genomic 
regulation. Here we report the in-depth analysis of 42 whole-genome 
bisulphite sequencing data sets across 30 diverse human cell and tis- 
sue types. We observe dynamic regulation for only 21.8% of autoso- 
mal CpGs within a normal developmental context, most of which are 
distal to transcription start sites. These dynamic CpGs co-localize with 
gene regulatory elements, particularly enhancers and transcription- 
factor-binding sites, which allow identification of key lineage-specific 
regulators. In addition, differentially methylated regions (DMRs) 
often contain single nucleotide polymorphisms associated with cell- 
type-related diseases as determined by genome-wide association 
studies. The results also highlight the general inefficiency of whole- 
genome bisulphite sequencing, as 70-80% of the sequencing reads 
across these data sets provided little or no relevant information about 
CpG methylation. To demonstrate further the utility of our DMR set, 
we use it to classify unknown samples and identify representative 
signature regions that recapitulate major DNA methylation dyna- 
mics. In summary, although in theory every CpG can change its methy- 
lation state, our results suggest that only a fraction does so as part 
of coordinated regulatory programs. Therefore, our selected DMRs 
can serve as a starting point to guide new, more effective reduced 
representation approaches to capture the most informative fraction 
of CpGs, as well as further pinpoint putative regulatory elements. 

Changes in DNA methylation patterns and the resulting DMRs have
been the focus of numerous studies in the context of normal development
and disease. These studies have characterized many different DMR
classes including partially methylated domains, condition-specific,
cell-type-specific and tissue-specific DMRs, as well as DMRs
arising in diseases such as cancer. Owing to the relatively small
fraction of genomic CpGs assayed or to small sample cohorts, the questions
of what fraction of genomic CpGs change their methylation state in the
context of normal development, and in what regulatory context they do so,
remain underexplored.

In this study, we systematically investigated the DNA methylation
state of most human autosomal CpGs to determine those that show
dynamic changes and hence may participate in genome regulation in a
developmental context (dynamic CpGs). In total, we included 42
whole-genome bisulphite sequencing (WGBS) data sets, comprising a range
of human cell and tissue types (n = 30). The combined 40.4 billion reads
enabled us to assay 25.71 million autosomal CpGs (≥5× coverage in
at least 50% of all samples; 96% of all hg19 autosomal CpGs)
(Supplementary Table 1). We organized the samples into four classes: human
embryonic stem (ES) cells and human ES-cell-derived cell populations,
primary cells, disease conditions, and long-term cultured cell lines (Fig. 1a
and Supplementary Table 1). On a global scale, human ES cells and
their derivatives exhibit the highest DNA methylation levels, followed
by primary cells (~5% less), which is in sharp contrast to the global
hypomethylation observed in colon cancer (~10-15% less) and
long-term cultured cell lines (10-30% less).

Focusing initially on our developmental sample set (n = 24 total; ES
cells, in-vitro-derived cell types and primary cells; Supplementary
Table 1), we identified ~5.6 million dynamic CpGs (minimum methylation
difference ≥ 0.3, false discovery rate (FDR) = 10.4%, 21.8% of
captured autosomal CpGs; Fig. 1b, Supplementary Fig. 1e and
Supplementary Information) distributed across 716,087 discrete DMRs
(19.2% of the mappable human genome; Supplementary Table 2). In
addition to this moderately stringent cut-off, we also tested thresholds
as low as 10% methylation difference that may account for DNA
methylation changes arising from relevant small subpopulations in
heterogeneous tissue samples or noise, but still find only 10.4 million
CpGs to be dynamic (Supplementary Fig. 1a-d).
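
As a rough illustration of the thresholding step described above, the following sketch flags a CpG as dynamic when the spread of its methylation ratios across samples reaches a minimum difference. It is a simplification under stated assumptions: the function name, the toy values and the use of a plain max-minus-min spread are hypothetical, and the statistical testing behind the reported FDR is not reproduced.

```python
from typing import Dict, List


def dynamic_cpgs(levels: Dict[str, List[float]], min_diff: float = 0.3) -> List[str]:
    """Return CpG identifiers whose methylation spread across samples is >= min_diff.

    `levels` maps a CpG identifier to its methylation ratios (0..1), one per sample.
    Assumption: a simple max-min spread stands in for the model-based difference.
    """
    selected = []
    for cpg, values in levels.items():
        if len(values) < 2:
            continue  # a spread needs at least two samples
        if max(values) - min(values) >= min_diff:
            selected.append(cpg)
    return selected


# Toy data (hypothetical values, not taken from the study)
toy = {
    "cpg_static_high": [0.92, 0.95, 0.90, 0.94],
    "cpg_dynamic": [0.91, 0.88, 0.35, 0.90],
    "cpg_static_low": [0.02, 0.05, 0.03, 0.04],
}
print(dynamic_cpgs(toy))        # stringent 0.3 cut-off
print(dynamic_cpgs(toy, 0.02))  # a much looser cut-off flags more CpGs
```

Lowering `min_diff` in this toy example mirrors the observation in the text that a looser threshold enlarges the dynamic set.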

Focusing on the more stringent set (0.3 difference), we find appro- 
ximately 70% are on average highly methylated (>75% methylation 
ratio), whereas less than 2% are on average unmethylated (<10% methy- 
lation ratio) (Supplementary Fig. 1h). In line with this observation, we 
find that hypomethylation of DMRs shows greater sample specificity 
than hypermethylation (Fig. 1c). Interestingly, most of the DMRs are 
small (>75% are smaller than 1 kilobase (kb); Supplementary Fig. 1i) 
and located distal to transcription start sites (Supplementary Fig. 1j).
However, the average variation in DNA methylation levels across all
RefSeq promoters (n = 30,090) does still exhibit a clear increase
specifically at the transcription start sites, with most of this variation
occurring at intermediate and low CpG density promoters (Fig. 1d). For CpG
islands in general, we observe distinct dynamic regimes, highlighting
that different classes of CpG islands are probably subject to different
modes of regulation (Fig. 1d, bottom). Consistent with previous
reports, we find CpG island shores (regions within 2 kb of an island)
to be among the most variable genomic regions (Supplementary Fig. 1o).
These observations are exemplified at the OCT4 (also known as POU5F1)
locus, in which the promoter and large parts of the gene body exhibit
high DNA methylation dynamics, whereas the strong downstream CpG
island as well as the surrounding CTCF-binding sites remain static
(Fig. 1e). Only 12.2% of our DMR set overlap with at least one of four
annotated classic, gene-centric genomic features (promoter, exon, CpG 
island (CGI), or CGI-shore; n = 568,430) (Fig. 1f). To gain insights into 
the role of the remaining set, we first investigated their co-localization 
with DNase I hypersensitive sites across 92 distinct cell types as well as
a catalogue of putative enhancer elements for 31 cell and tissue types.
Notably, we found that 42.3% of our DMRs overlap with at least one
DNase I hypersensitive site (Fig. 1f), and 26.1% co-localize with
enhancer-like regions, which cover more than 50% of all H3K27ac regions in
our catalogue (n = 285,344) and represent one of the most differentially
methylated features (Fig. 1d). Next, we examined DMR overlap with
transcription-factor-binding site (TFBS) clusters compiled from 165
transcription factors profiled by the ENCODE project and uncovered
a highly significant overlap of the two feature classes (odds ratio 1.14,
P < 0.01, empirical test; Supplementary Information). Interestingly, we
find that more than 50% of all DMRs overlap with at least one and 25%
with more than three TFBSs, accounting for an additional 13.0% of
DMRs (Fig. 2a). Consistent with this, we find markedly increased
variation in DNA methylation levels specifically across TFBSs
(Supplementary Fig. 2a). In summary, we were able to attribute 64.2% of
all DMRs to at least one putative gene regulatory element or coding
sequence (Supplementary Fig. 1e-h), suggesting that they demarcate various
classes of regulatory elements.

Figure 1 | Identification and characteristics of DMRs in the human genome.
a, Principal component (PC) analysis based on CpG methylation levels for 1-kb
tiles across 30 diverse human cell and tissue samples. Colouring indicates
classification of samples into subgroups and group-wise mean DNA methylation.
Detailed sample annotations are listed in Supplementary Table 1. Grey area
indicates Alzheimer’s disease (AD) samples. b, Density scatterplot of CpG-wise
DNA methylation level differences (x axis, P ≤ 0.01) and CpG median methylation
(y axis) across the 24 developmental samples (excluding cancer and long-term
culture). Colouring indicates CpG density from low (blue) to high (red). The red
box highlights dynamic CpGs (≥0.3). c, Cumulative distribution of DMR
specificity. High hypo/hypermethylation specificity indicates that a particular
region is methylated/unmethylated in most tissues and deviates from this default
state in only one or a few cases. d, Top, composite plot of mean DNA methylation
differences across various genomic features. Black lines indicate the median of the
average DNA methylation difference across each feature. Grey areas mark
twenty-fifth and seventy-fifth percentiles. Bottom, distribution of mean DNA
methylation difference for each genomic feature. Black bars indicate twenty-fifth
and seventy-fifth percentiles; white dots mark the median. For CpG islands, a
smaller, experimentally determined set (eCGI; n = 25,490) is also shown. Promoters
are broken down into high CpG content (HCP; n = 24,899), intermediate CpG
content (ICP; n = 10,920) and low CpG content (LCP; n = 7,946) regions
(n = 43,765 total). Shore denotes regions within 2 kb of an island; eShore denotes
experimentally determined shore. pEnhancer, putative enhancer. e, Methylation
level variation across the OCT4 locus (chr6: 31,119,000-31,162,000) (top). Blue bars
indicate significant DMRs at P ≤ 0.01 that exhibit a minimum difference ≥ 0.3
across the 24 developmental samples. Grey boxes (1-3) are examples of regions that
are static (1 and 2) or that do not meet the threshold of dynamic (3). For reference,
ENCODE TFBS cluster track, DNase I hypersensitive sites, CpG islands and RefSeq
genes are shown. DNAme, DNA methylation. f, Distribution of DMRs across
various genomic features. Each region is assigned to only one of these genomic
features according to the ranking promoter, CGI, CGI shore, 5’ exon, exon, intron,
putative enhancers, DNase I hypersensitive site (DHS) or other.

We determined all cell-type-specific hypomethylated regions (n = 430,250; 
see Supplementary Information) and investigated the enrichment for 
161 ENCODE factors (excluding MBD4, SETDB1, POL2P and HDAC2 
from the previous set). Notably, we observe significant enrichment of 
cell-type-specific transcription factors that are known to be involved 
in the regulation of the respective cellular states (Fig. 2b). For instance, 
the top three factors enriched in HUES64-specific DMRs are OCT4, 
SOX2 and NANOG (Fig. 2b). Similarly, PU.1 and TAL1 are highly enriched
in CD34 cells and hepatocyte nuclear factors in adult liver (Fig. 2b).
In further support of this, motif enrichment analysis revealed many 
more interesting cell-type-specific transcription factor associations, such 
as enrichment of distinct NKX factors in fetal heart and brain, and 
ESRRG in fetal adrenal cells (Supplementary Fig. 2b and Supplemen- 
tary Table 3). Moreover, we tested whether the DMR set can be used to 
gain insights into the combinatorial control of cellular states by trans- 
cription factors. To that end, we determined all unmethylated (<10% 
methylation) PAX5 motif instances (±100 base pairs (bp)) across the
human genome in CD34 or fetal brain cells (Fig. 2c). Although both 
footprint sets show a large overlap (11,031 sites), regions exclusively 
unmethylated in CD34 or fetal brain are enriched for distinct sets of 
other known lineage-specific transcription factor motifs, such as PU.1
in CD34 and LMX1A or EN1 in fetal brain (Fig. 2c). Taken together, 
these findings highlight that cell-type-specific DNA methylation pat- 
terns can be used to detect footprints and infer potential co-regulation 
by transcription factors. In fact, more than 60% of all ENCODE TFBSs 
are hypermethylated in most samples, but become hypomethylated 
very specifically in only one or two cell types (Fig. 2d), whereas 25% are 
constitutively unmethylated and never change (Fig. 2d). 

Breaking down this distribution of TFBSs reveals distinct patterns
of variation for different types of transcription factor (Supplementary
Fig. 2e). More generally, we find that DNA methylation variation across
TFBSs is strongly correlated with their median methylation level and
therefore the (hypo-)methylation specificity (Supplementary Fig. 2c), as well
as the tissue specificity of transcription factor expression (Supplementary
Fig. 2d). These observations support the notion that selective
transcription factor binding creates and/or maintains spatially highly
constrained hypomethylated regions and confers cell type specificity.

On the basis of these findings and previous reports, we asked whether
DMRs are more susceptible to point mutations that are functionally
consequential. Even with strict filtering criteria, we found a significant
enrichment of single nucleotide polymorphisms (SNPs) in DMRs compared
to genomic background as well as different sets of random control
regions (odds ratio 1.06, P < 10^-16, binomial test; Supplementary
Information). We then determined the overlap of DMRs with recently
evolved human-specific CpGs, termed CpG beacons”, which shows a 
marked enrichment (odds ratio 1.37—1.6 compared to genomic back- 
ground and random control regions, P< 10 '°). This suggests overall 
higher genetic intra-species variability specifically at regions that change 
their DNA methylation state. In concordance with the increased SNP 
frequency, DMRs are also significantly enriched for genome-wide asso- 
ciation study (GWAS) SNPs from the GWAS catalogue”® (odds ratio 
1.16, P= 3.27 X 10 '°, binomial test). Similar to our observations on 
TFBSs, GWAS SNPs exhibit a non-random enrichment distribution 
across cell-type-specific DMRs (Fig. 3). For instance, we find DMRs 
specific to adult liver to be enriched for liver and serum metabolite- 
related GWAS SNPs, fetal heart DMRs enriched for cardiovascular- 
disease SNPs, and many of our blood-cell-type DMRs enriched for 
autoimmune disease and haematological parameter related SNPs. 
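
The enrichment statistics quoted in this paragraph (an odds ratio with a binomial test) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and the toy counts are hypothetical, and the exact binomial sum is only practical for small counts.

```python
from math import comb


def odds_ratio(hits_in: int, total_in: int, hits_bg: int, total_bg: int) -> float:
    """Odds of a hit inside the region set divided by the odds in the background."""
    odds_in = hits_in / (total_in - hits_in)
    odds_bg = hits_bg / (total_bg - hits_bg)
    return odds_in / odds_bg


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed exactly (suitable for small n)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical counts: 12 SNPs among 100 positions inside DMRs,
# against 100 SNPs among 1,000 background positions.
ratio = odds_ratio(12, 100, 100, 1000)
p_value = binomial_upper_tail(12, 100, 100 / 1000)
print(f"odds ratio = {ratio:.2f}, one-sided P = {p_value:.3f}")
```

Genome-scale counts would call for a numerically stable tail computation rather than the exact sum used here.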

Figure 2 | Dynamic CpG methylation regions frequently co-localize with
TFBSs. a, Overlap of DMRs with ENCODE TFBSs. b, Enrichment of the top
four TFBSs significantly overrepresented (P < 0.01, empirical test) in DMRs
specific to the cell type indicated (specificity > 0.15). Colour code quantifies
median enrichment odds ratio compared to size-matched random control
regions. Cmucosa, colon mucosa; F denotes fetal. c, Overlap of PAX5 motifs
(±100 bp; top) unmethylated in CD34 cells or fetal brain across the entire
human genome. Regions specifically unmethylated in CD34 or fetal brain were
subjected to motif analysis, and top differentially co-occurring motifs are
highlighted on the left for CD34 and on the right for fetal brain. d, Density
scatterplot of maximum DNA methylation difference across 24 developmental
samples for TFBS cluster track (n = 2.7 million) and median methylation level
across all samples. Colour code indicates density of TFBSs from low (blue) to
high (red).

Figure 3 | DMRs exhibit increased SNP frequency and show non-random
GWAS SNP enrichment. Odds ratio of significantly overrepresented
(P < 0.05 (except for Snigra; *P < 0.1), empirical test, see Supplementary
Information) GWAS SNPs grouped into 16 categories in regions specifically
hypomethylated within the sample indicated on the left.

It is well known that many cancers exhibit considerable DNA methylation
changes; we therefore compared a colon cancer sample to a matched
control and found 532,665 differentially methylated CpGs. Forty per
cent of these overlapped with the previously identified developmental
dynamic set (Fig. 4a). Similarly, 37% of differentially methylated CpGs
found in Alzheimer’s disease samples compared to normal controls
(n = 12,408) overlapped with our previous set of developmental CpGs
(4,540 out of 12,294). The most notable change in the number of 
dynamic CpGs occurs when comparing our developmental sample 
cohort to the long-term cell culture cohort, leading to the identification 
of 8.4 million additional dynamic CpGs (Fig. 4b). Importantly, this 
expanded set differs notably in terms of their sequence features, with 
cancer and Alzheimer’s disease dynamic CpGs residing in less con- 
served regions that also exhibit lower motif complexity compared to 
the developmental and cell culture (Supplementary Fig. 4a, b). The cell- 
culture-specific CpGs exhibit increased repeat content relative to 
developmental CpGs, a feature that is shared with Alzheimer’s disease 
(Fig. 4c). Although the disease samples clearly add more dynamic 
CpGs, our analysis suggests a notable overlap with our previous set 
for CpGs that may participate in actual regulatory events. 

Finally, we investigated the utility and power of the reduced region 
set to classify accurately unknown samples or help to deconvolute a 
mixture of samples. We first clustered our developmental sample set 
based on the DMRs only (Fig. 4d) and found the result to be in excel- 
lent agreement with genome-wide 1-kb tiling-based clustering (Sup- 
plementary Fig. 5a). To probe the potential of our DMR set to classify 
unknown samples accurately, we derived signature region sets for diffe- 
rent sample groups. These signature regions turned out to be excellent 
classifiers of an unseen sample (Fig. 4e, fetal brain). Next, we tested as a 
proof of principle whether it is possible to use our DMR set to infer the 
different cell populations present within a heterogeneous sample. To 
that end, we deconvoluted an in silico mixture of HUES64 and hippo- 
campus WGBS libraries using our DNA methylation signatures. Nota- 
bly, the two top hits after application of a very simple deconvolution 
algorithm indeed proved to be hippocampus and HUES64 (Fig. 4f). 
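
The classification and deconvolution ideas in this paragraph can be sketched in a few lines. The source does not specify its "very simple deconvolution algorithm", so everything below is an assumption made for illustration: reference signatures are ranked by mean absolute distance to the sample, and a two-component mixture is resolved by closed-form least squares. The sample labels reuse names from the text, but the methylation values are invented.

```python
from typing import Dict, List


def mean_abs_distance(a: List[float], b: List[float]) -> float:
    """Mean absolute difference between two equally long methylation vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def rank_references(sample: List[float], refs: Dict[str, List[float]]) -> List[str]:
    """Reference names ordered from closest to farthest from the sample."""
    return sorted(refs, key=lambda name: mean_abs_distance(sample, refs[name]))


def mixing_fraction(mix: List[float], a: List[float], b: List[float]) -> float:
    """Least-squares estimate of w in mix ~ w*a + (1 - w)*b, clipped to [0, 1]."""
    num = sum((m - y) * (x - y) for m, x, y in zip(mix, a, b))
    den = sum((x - y) ** 2 for x, y in zip(a, b))
    if den == 0:
        raise ValueError("reference profiles are identical")
    return min(1.0, max(0.0, num / den))


# Hypothetical signature-region methylation levels for three references
refs = {
    "HUES64": [0.9, 0.8, 0.1, 0.9],
    "hippocampus": [0.2, 0.9, 0.8, 0.1],
    "liver": [0.1, 0.1, 0.9, 0.5],
}
# In silico 50:50 mixture of two of the references
mix = [0.5 * x + 0.5 * y for x, y in zip(refs["HUES64"], refs["hippocampus"])]

ranking = rank_references(mix, refs)
weight = mixing_fraction(mix, refs["HUES64"], refs["hippocampus"])
print(ranking, round(weight, 3))
```

With more than two candidate components, a constrained regression such as non-negative least squares would be the natural extension of this sketch.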

Figure 4 | Effective classification and sample deconvolution using only the
DMR set. a, Overlap of dynamic CpGs (P < 0.01; |methylation difference| ≥ 0.3)
between the developmental and cancer data sets. The number of CpGs is shown in
millions. b, Distribution of autosomal static and dynamic (across the three sets)
CpGs. Class name indicates sample group in which a CpG was observed dynamic
(developmental (n = 24), cell culture (n = 3), cancer (n = 2)) or remained
unchanged over the entire sample set (n = 30). c, Repeat content distribution of
DMRs (sets as in b). d, Hierarchical clustering using Pearson correlation coefficient
(PCC) of the DMR values across the entire sample set (n = 30). e, Distance of
the fetal brain sample to different sets of signature regions defined for sample
classes or individual samples, but excluding regions identified by means of the fetal
brain sample. f, Contribution of individual sample signature region sets to an
in-silico-generated hybrid sample (HUES64 and hippocampus).

Our study highlights and defines a relatively small subset of all
genomic CpGs that change their DNA methylation state across a large
number of representative cell types. Although we expect that number to
increase with more diverse cell types as more WGBS data sets become
available, our analysis suggests that the rate of newly discovered
regulatory CpGs will drop rapidly once all major cell and tissue types
have been mapped, mostly owing to the fact that between-tissue
variability exceeds within-tissue variability by one order of magnitude
(Supplementary Fig. 3a, b). Future studies are likely to fine map dynamics
occurring in more specific subpopulations, giving rise to smaller
changes in DNA methylation that we were unable to detect or include because
of power constraints. Extreme conditions in vitro or in vivo, such as loss
or misregulation of the maintenance methylation machinery, will
affect a larger subset including many intergenic CpGs that are
generally static, but most of these additional CpGs are unlikely to overlap
with functional elements such as TFBSs or enhancers. In combination
with the fact that sequencing of WGBS libraries is very inefficient, as
about 65% of all 101-bp reads in our set did not even contain any CpGs
to begin with, this amounts to an approximate, combined loss of around
80% of sequencing depth on non-informative reads and static regions.
Furthermore, once defined, it will probably be sufficient in most cases to
profile only a representative subset of CpGs across a comprehensive set
of DMRs using an array-based or hybrid-capture-based technology
to recover representative dynamics and measure regulatory events.
Using these results as a guiding principle, we expect further improvements
in the efficiency of mapping DNA methylation, which should enhance its
applicability as a marker for various regulatory dynamics in normal and
disease phenotypes.


METHODS SUMMARY 


Biological materials and sequencing libraries. Genomic DNA was fragmented to 
100-500 bp using a Covaris S2 sonicator. DNA fragments were cleaned-up, end- 
repaired, A-tailed and ligated with methylated paired-end adapters (purchased 
from ATDBio). See Supplementary Information for details. 

Data processing and analysis. In-house-generated WGBS libraries were aligned
using MAQ in bisulphite mode to the hg19/GRCh37 reference assembly.
Subsequently, CpG methylation calls were made using custom software,
excluding duplicate and low-quality reads, as well as reads with more than
10% mismatches. Methylation ratios of individual CpGs were described using
a beta-binomial model estimating parameters from the number of methylated
and total reads overlapping a given CpG, incorporating replicates. Only CpGs
covered by ≥5× reads were considered for further analysis. Differential
methylation values of individual CpGs were estimated using the
beta-difference distributions. CpG cluster differential methylation was
determined by pooling CpG level methylation differences using a random
effects model. CpG cluster methylation specificity was determined using the
Jensen-Shannon divergence of the methylation level distribution of a CpG
cluster across all samples, and a reference distribution representing either
of the two extremes: completely unmethylated or fully methylated.
In-silico-identified CpG islands were defined by genomic regions of at least
700 bp in length, a CpG observed versus expected ratio of greater than 0.6,
and a GC content greater than or equal to 0.5. For the SNP analysis, we
obtained the CEPH SNP set from the University of California Santa Cruz
(UCSC). GWAS SNPs were retrieved from the GWAS catalogue, whereas
most of the GWAS SNP grouping was taken from ref. 24. For TFBS analysis, we
retrieved peak files from the ENCODE projects and collapsed replicates. For
detailed methods see Supplementary Information.
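
Two of the quantities named in this summary can be illustrated with a short sketch. Several points are assumptions, since the full formulas are deferred to the Supplementary Information: the specificity function normalises the per-sample unmethylated signal and compares it with a uniform reference, and the observed/expected CpG ratio uses the common definition (CpG count × length) / (C count × G count). All function names are hypothetical.

```python
from math import log2
from typing import List


def jsd(p: List[float], q: List[float]) -> float:
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a: List[float], b: List[float]) -> float:
        return sum(x * log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def hypo_specificity(meth: List[float]) -> float:
    """JSD between the normalised unmethylated signal and a uniform reference.

    Assumed reading: a region unmethylated in a single sample scores high,
    a region with equal levels in every sample scores zero.
    """
    unmeth = [1.0 - v for v in meth]
    total = sum(unmeth)
    if total == 0:
        return 0.0  # fully methylated everywhere: nothing to localise
    p = [u / total for u in unmeth]
    q = [1.0 / len(meth)] * len(meth)
    return jsd(p, q)


def is_cpg_island(seq: str) -> bool:
    """Apply the length, observed/expected CpG and GC-content criteria."""
    s = seq.upper()
    n = len(s)
    if n < 700:
        return False
    c, g = s.count("C"), s.count("G")
    if c == 0 or g == 0:
        return False
    obs_exp = s.count("CG") * n / (c * g)  # assumed observed/expected formula
    gc = (c + g) / n
    return obs_exp > 0.6 and gc >= 0.5


print(hypo_specificity([1.0, 1.0, 1.0, 0.0]))  # unmethylated in one sample only
print(hypo_specificity([0.9, 0.9, 0.9, 0.9]))  # same level in every sample
print(is_cpg_island("CG" * 350))
```

The hypermethylation counterpart would follow the same pattern with the methylated signal in place of `1 - v`.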
DNA unwinding heterogeneity by RecBCD results from static molecules able
to equilibrate

Bian Liu, Ronald J. Baskin & Stephen C. Kowalczykowski


Single-molecule studies can overcome the complications of asyn- 
chrony and ensemble-averaging in bulk-phase measurements, pro- 
vide mechanistic insights into molecular activities, and reveal 
interesting variations between individual molecules’ *. The applica- 
tion of these techniques to the RecBCD helicase of Escherichia coli 
has resolved some long-standing discrepancies, and has provided 
otherwise unattainable mechanistic insights into its enzymatic 
behaviour**. Enigmatically, the DNA unwinding rates of indi- 
vidual enzyme molecules are seen to vary considerably**, but the 
origin of this heterogeneity remains unknown. Here we investigate 
the physical basis for this behaviour. Although any individual 
RecBCD molecule unwound DNA at a constant rate for an average 
of approximately 30,000 steps, we discover that transiently halting 
a single enzyme-DNA complex by depleting Mg2+-ATP could
change the subsequent rates of DNA unwinding by that enzyme 
after reintroduction to ligand. The proportion of molecules that 
changed rate increased exponentially with the duration of the inter- 
ruption, with a half-life of approximately 1 second, suggesting that a 
conformational change occurred during the time that the molecule 
was arrested. The velocity after pausing an individual molecule 
was any velocity found in the starting distribution of the ensemble. 
We suggest that substrate binding stabilizes the enzyme in one of 
many equilibrium conformational sub-states that determine the 
rate-limiting translocation behaviour of each RecBCD molecule. 
Each stabilized sub-state can persist for the duration (approxi- 
mately 1 minute) of processive unwinding of a DNA molecule, com- 
prising tens of thousands of catalytic steps, each of which is much 
faster than the time needed for the conformational change required 
to alter kinetic behaviour. This ligand-dependent stabilization of 
rate-defining conformational sub-states results in seemingly static 
molecule-to-molecule variation in RecBCD helicase activity, but in 
fact reflects one microstate from the equilibrium ensemble that a 
single molecule manifests during an individual processive trans- 
location event. 
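
The reported dependence of rate switching on pause duration can be written as a one-line model. This is a sketch under assumptions: the text gives only an exponential dependence with a half-life of about 1 second, so the rise-to-plateau form and the `plateau` parameter (the fraction of molecules that change rate after a very long pause) are hypothetical.

```python
def fraction_changed(pause_s: float, half_life_s: float = 1.0, plateau: float = 1.0) -> float:
    """Fraction of molecules with a changed rate after a pause of `pause_s` seconds.

    Assumed model: exponential approach to `plateau` with the given half-life.
    """
    if pause_s < 0:
        raise ValueError("pause duration must be non-negative")
    return plateau * (1.0 - 0.5 ** (pause_s / half_life_s))


for pause in (0.0, 0.5, 1.0, 2.0, 5.0):
    print(f"{pause:4.1f} s pause -> {fraction_changed(pause):.3f}")
```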

The RecBCD enzyme is an important helicase/nuclease in the repair of
double-stranded DNA (dsDNA) breaks via homologous recombination.
RecBCD initiates homologous recombination by processing dsDNA
to generate 3'-ended single-stranded DNA (ssDNA) upon recognition
of the recombination hotspot sequence χ (crossover hotspot instigator
(Chi); 5'-GCTGGTGG-3'). The RecB and RecD subunits are SF1
helicases with 3'→5' and 5'→3' translocation polarities, respectively.
RecC holds the complex together and recognizes χ. RecB and RecD
drive dsDNA unwinding by acting as ssDNA motors, pulling the two
antiparallel strands of the DNA across a pin in the RecC subunit and thus
splitting the duplex DNA.

Earlier single-molecule studies of DNA unwinding by RecBCD
revealed considerable variation in the unwinding rates of individual
molecules. To understand the molecular origin of this intrinsic
heterogeneity, we analysed the unwinding behaviour of a larger set of
individual RecBCD molecules on bacteriophage λ DNA lacking χ (Fig. 1a, b).
A total of 251 molecules were initially analysed (Fig. 1c). The majority
(96%) of the molecules did not change their speeds during unwinding
(Fig. 1b). Individual RecBCD molecules were observed to unwind and degrade
DNA at constant velocities for 30-60 s, over tens of thousands of
catalytic turnovers. Although the rate distribution in earlier studies could
be fit to a single Gaussian function, the sizes of those data sets were
limited; the comparatively large number of single-molecule unwinding
rates obtained here provides clear evidence of a non-unimodal distribution
(Fig. 1c). The distribution was fit to the sum of two Gaussian functions; the
major population of molecules (71%) has a mean fitted rate of 1,584 ± 95
base pairs (bp) s^-1 (± standard deviation (s.d.)) whereas the minor
population (29%) has a mean rate of 907 ± 500 bp s^-1 (± s.d.). The difference
in unwinding rates between the fast and slow populations is considerably
beyond the experimental uncertainty. The slow population is not due to
the recognition of χ-like sequences, because such events are readily
discerned as pauses followed by a velocity change (Supplementary
Fig. 1). Interestingly, the fast molecules are more processive than the
slow ones (Supplementary Fig. 2). Both the rate and processivity of the
slow species are comparable to the behaviour of RecBCD mutants with
a defective motor subunit, leading us to examine the single-molecule
behaviour of two such single-motor mutant enzymes. DNA unwinding
by RecBCD(K177Q) (RecBCD* in Fig. 1c) is manifest as a single Gaussian
distribution with a mean rate of 729 ± 290 bp s^-1, and for RecB(K29Q)CD
(RecB*CD in Fig. 1c) it is 432 ± 227 bp s^-1 (see also Supplementary
Fig. 2). These findings suggest that, for the wild-type enzyme, the slow
species represents enzymes wherein one motor subunit is initially not
engaged, but can be reversibly re-engaged when halted (see below).

Figure 1 | Unwinding of DNA by individual RecBCD molecules is
heterogeneous, with a fixed rate for the duration of DNA translocation.
a, Visualization of RecBCD unwinding an individual DNA molecule:
experimental scheme (top) and sequential images (bottom). b, Time courses for
unwinding DNA (lacking a χ sequence) by different RecBCD molecules: black,
absence of RecBCD; colours, individual RecBCD enzymes. Errors are standard
error of the fit. c, Distribution of unwinding rates for wild-type RecBCD and
motor mutants, fit to the sum of two Gaussian functions and a single Gaussian,
respectively. The distribution of the motor mutants was summed to represent
equal numbers of each protein. Errors are the s.d.
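
The two-Gaussian description of the wild-type rate distribution given above can be turned into a small calculation. This is a sketch only: it plugs the weights, means and standard deviations as quoted in the text into a mixture density and derives the probability that a molecule with a given rate belongs to the fast population. The function names are hypothetical, and the fitting procedure itself is not reproduced.

```python
from math import exp, pi, sqrt

# (weight, mean in bp/s, s.d. in bp/s) as quoted in the text
FAST = (0.71, 1584.0, 95.0)
SLOW = (0.29, 907.0, 500.0)


def gaussian_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return exp(-0.5 * z * z) / (sd * sqrt(2.0 * pi))


def mixture_pdf(rate: float) -> float:
    """Density of the two-component mixture at the given unwinding rate."""
    return sum(w * gaussian_pdf(rate, m, s) for w, m, s in (FAST, SLOW))


def prob_fast(rate: float) -> float:
    """Posterior probability that a molecule with this rate is in the fast population."""
    w, m, s = FAST
    return w * gaussian_pdf(rate, m, s) / mixture_pdf(rate)


for rate in (600.0, 1200.0, 1584.0):
    print(f"{rate:6.0f} bp/s -> P(fast) = {prob_fast(rate):.3f}")
```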

The origin of heterogeneity can be dynamic or static'*"'’. Whereas 
dynamic heterogeneity was suggested to arise from conformational 
fluctuations of a protein, static heterogeneity can have different sources.
It can arise from chemical heterogeneity owing to the presence of mul- 
tiple related genes, or from post-translational modifications’®. It can 
also result from enzyme molecules with identical chemical composition 
that have different stable conformational sub-states in equilibrium'*’”” 
or that are kinetically trapped in non-equilibrium states capable of mul- 
tiple turnovers’. We initiated experiments designed to distinguish 
between these possible origins. Although the protein preparation con- 
tained no detectable heterogeneity in polypeptide composition (Sup- 
plementary Fig. 3), the distributions of unwinding rates for RecBCD 
eluted from different fractions of a chromatographic elution peak were 
examined as the first trivial source of heterogeneity; no experimentally 
significant differences in the distribution profiles were found (Sup- 
plementary Fig. 4). We next considered the possibility that the hetero- 
geneity arose from RecBCD species that were not at equilibrium, but 
rather were trapped in different kinetic conformations. In an attempt to 
permit such hypothetically trapped conformations to relax to the equi- 
librium distribution, we subjected the enzyme population to experi- 
mental conditions that could potentially allow redistribution. Partial 
destabilization of protein structure, followed by refolding, can allow 
protein molecules to relax to their global minimum on the folding 
energy landscape, resulting in an equilibrium distribution of enzymes. 
We first used thermal annealing”. Ensemble assays showed that 
RecBCD could be heated to a maximum of 45 °C for 10 min, with no 
loss of activity (Supplementary Fig. 5a, b). Therefore, an enzyme
population that was treated at 45 °C, and slowly cooled at a rate of
1 °C min^-1, was analysed using single-molecule methods. The distribution
of the rates for the thermally treated enzymes was not statistically 
different from the original distribution (P= 0.45; Supplementary 
Fig. 5c). 

An alternative to thermal annealing is to use a chemical denaturant to
unfold a protein, followed by slow removal, to permit refolding to the
equilibrium distribution. Thus, we next investigated the effect of
partial unfolding of RecBCD by the classical denaturant guanidine
hydrochloride (GuHCl). The enzyme could be reversibly renatured after
treatment with up to 0.5 M GuHCl (Supplementary Fig. 6a). The velocity
distribution of the resultant individual enzymes had a mean of
1,736 ± 133 bp s⁻¹ for the fast population versus 1,773 ± 104 bp s⁻¹
for the control (Supplementary Fig. 6b), which is the same within
experimental uncertainty. The mean of the treated slow population is
556 ± 451 bp s⁻¹ versus 793 ± 307 bp s⁻¹ for the control population;
although the mean for the slower group seems to be reduced, the
difference is not significant (P = 0.24). In conclusion, neither thermal
annealing nor chemical refolding produced a more homogeneous distribution,
indicating that either these treatments are insufficient to permit
redistribution, or that the population of RecBCD enzymes is intrinsically
heterogeneous.

Figure 2 | The DNA unwinding rate of single enzymes is stochastically
changed to a velocity within the original distribution, after transient
depletion of Mg²⁺-ATP. a, DNA unwinding by three representative RecBCD
enzymes. The grey block indicates the pause duration. Errors are standard error
of the fit. b, The rates before and after pausing (n = 173). Error bars represent
the standard error of the fit. c, Distribution of rates before (blue) and after (red)
pausing for molecules with an initial rate of 1,450-1,550 bp s⁻¹ (blue box, panel
b; n = 36). Before pausing, the selected bin had a mean velocity of
1,493 ± 27 bp s⁻¹ (s.d.); after pausing and redistribution, the mean velocity was
1,245 ± 453 bp s⁻¹ (s.d.) (median = 1,411 bp s⁻¹). d, Proportion of molecules
that changed rates after pausing plotted versus pause duration and fitted to an
exponential curve; error bars are expected bounds assuming a binomial
distribution of switching events. e, Scatter plot of the relative rate changes after
two pauses (n = 34).

It remained possible that the conformational distribution of
RecBCD enzyme was, in fact, at equilibrium owing to the presence
of multiple conformations of similar free energy, but the binding of
substrates could lock an enzyme in a given conformation. For
RecBCD, each DNA binding event allows unwinding of tens of
thousands of base pairs, perhaps suggesting that the initial binding locks
the enzyme in a conformation that lasts the duration of the unwinding
process, a form of conformational selection. Given that we had been
unable to alter the distribution of RecBCD enzyme rates by more
traditional means, we next examined whether depletion of a ligand,
ATP, permitted a change to an altered conformation while bound to
the DNA. Consequently, we stopped individual RecBCD molecules
during the course of unwinding by depleting this essential cofactor,
and then measured the rate upon reintroduction of the ligand and
restarting the same enzyme. This was achieved by first moving a single,
optically trapped enzyme-DNA complex into the reaction channel
containing ATP to initiate unwinding. After a length of time sufficient
to accurately determine the rate of DNA unwinding (~10 s), the
complex was moved to a third channel that contained 10 mM EDTA, but
neither Mg²⁺ nor ATP, to stop unwinding. After a defined length of
time, the arrested RecBCD-DNA complex was moved back to the
reaction channel to resume unwinding. By halting RecBCD in this
manner for 20 s, we found that about 50% (173 out of 354) of
complexes restarted unwinding when moved back to the reaction channel;
we presume that RecBCD dissociated from the remainder. Fig. 2a
shows the time courses for three characteristic RecBCD molecules.
For molecule 1, the unwinding rate decreased from 1,443 bp s⁻¹ to
507 bp s⁻¹; for molecule 2, it was the same upon resumption; and
for molecule 3, it increased from 1,447 bp s⁻¹ to 1,648 bp s⁻¹. After
the 20-s incubation in EDTA, 53% (91 out of 173) of the molecules
continued unwinding with the same rate (within a 20% difference),
whereas 35% (n = 61) of molecules slowed, and 12% (n = 21) of
molecules increased, speed (Fig. 2b). This finding shows that the rate of
individual RecBCD molecules is not static and the heterogeneity in
rates cannot be, at least not solely, due to variation in covalent or
irreversibly trapped structures. Note that when DNA unwinding was
observed in the continuous presence of ATP (Fig. 1b), spontaneous
rate-change events (Supplementary Fig. 1) were rare (4%) and
attributable to χ-like recognition events. By contrast, when unwinding was
interrupted by transiently removing ATP, at least 47% of the enzyme
molecules resumed unwinding at a different rate upon re-introduction
of ATP (Fig. 2a, b), suggesting that omission of the ATP ligand
permitted a conformational switch that affects the rate-limiting
translocation behaviour of RecBCD. These results support the notion that
ligand binding locks the enzyme into a conformational state that
typically persists for the duration of a single processive DNA unwinding
transaction, whereas the absence of ATP allows the enzyme molecule
to change its conformational state within the time it was halted.
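
The bookkeeping behind the percentages quoted for the 20-s pause can be sketched as follows. This is only an illustration: the 20% tolerance comes from the text, but the function name `classify_rate_change`, the choice to measure the change relative to the rate before the pause, and the second example pair are assumptions of this sketch, not the authors' stated procedure.

```python
def classify_rate_change(rate_before, rate_after, tolerance=0.20):
    """Label a restart as 'same', 'slower' or 'faster'.

    The 20% tolerance follows the text; measuring the change relative to
    the rate before the pause is an assumption of this sketch.
    """
    if rate_before <= 0:
        raise ValueError("rate_before must be positive")
    relative_change = (rate_after - rate_before) / rate_before
    if abs(relative_change) <= tolerance:
        return "same"
    return "slower" if relative_change < 0 else "faster"


# Molecule 1 of Fig. 2a (bp/s), plus one hypothetical pair for illustration.
label_molecule_1 = classify_rate_change(1443, 507)
label_hypothetical = classify_rate_change(1000, 1100)

# Counts reported for the 173 molecules that restarted after a 20-s pause.
counts = {"same": 91, "slower": 61, "faster": 21}
total = sum(counts.values())
percentages = {k: round(100 * v / total) for k, v in counts.items()}
changed_fraction = (counts["slower"] + counts["faster"]) / total
```

With these counts, the slowed and accelerated molecules together make up 82 of 173, which is the "at least 47%" figure given in the text.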

The blue box in Fig. 2b highlights a binned region of the single-molecule
velocity distribution containing a relatively well-populated group of
molecules (n = 36) that translocated at rates between 1,450 and 1,550 bp s⁻¹
before pausing. After incubation in EDTA, the velocities became broadly
redistributed, ranging from 300 bp s⁻¹ to 1,900 bp s⁻¹. The new
distribution of velocities for this group is similar to the starting distribution for
all the molecules (Fig. 2c and Supplementary Fig. 7), although the new
distribution is overrepresented by molecules that switched to the slow 
macrostate (that is, with one motor disengaged). This finding demon- 
strates that an enzyme molecule with a fixed velocity can switch to any 
other velocity that was initially displayed by other enzymes in the ori- 
ginal ensemble; similar redistribution was seen for other well-populated 
bins of molecules (Supplementary Fig. 7b, c). These findings indicate 
that all of the conformational sub-states of the ensemble are accessible to 
an enzyme after pausing. The velocity after ligand depletion is not related 
to the starting velocity of the enzyme, but rather, each enzyme equili- 
brated to a new velocity that was represented in the initial ensemble. The 
velocity distributions for enzymes, both before and after arrest, are not 
unimodal although, after being halted, the percentage of molecules in the 
slow group increases (Supplementary Fig. 7a). These results indicate that 
a RecBCD molecule can adopt any conformation on the free energy 
landscape, after being subjected to transient depletion of ATP. To ensure 
that the rate changes were not specific to the pausing by EDTA,
experiments were conducted by stopping the RecBCD-DNA complex in a
channel devoid of ATP but containing Mg²⁺. Similar results were
obtained (Supplementary Fig. 8).

When the duration of the time arrested without ATP was decreased
to 2 s, the percentage of complexes that resumed unwinding increased
to 78%, although fewer (33%) switched velocity (Supplementary Fig. 9).
Upon increasing the incubation time in the EDTA channel, the
proportion of molecules that changed rate increased exponentially with a
half-life of 1.3 ± 0.4 s (Fig. 2d), suggesting that a conformational change
responsible for the change in velocity in the absence of ATP requires
~1 s. The combined data set for all pauses (Supplementary Fig. 9d;
n = 445) shows that, with some underrepresentation of the slow start- 
ing velocities, there is switching from any one microstate to any other 
microstate. Given the existence of two macrostates (the fast population 
with two motors attached, and the slow population with one motor 
attached), when velocity switches that occur only within a macrostate 
are considered, the velocity redistributions are completely random (Sup- 
plementary Fig. 9e, f). Because the rate of ATP hydrolysis is rapid (ran- 
ging from a few hundred to a few thousand per s) relative to the half-life 
for the conformational change (1.3 s), the time between two adjacent 
ATP binding events would be too short (on the order of ms) for the 
unliganded apo-form of RecBCD to adopt a different conformation 
during the time that ADP has dissociated and before ATP has re-bound. 
For this reason, we presume that spontaneous switching is rare. Our 
interpretation is in accord with an earlier study which found that a few
individual RecBCD enzymes can spontaneously change velocity when
examined at low (15 µM) ATP. At such a low ATP concentration, the
apo-form of the enzyme is longer lived and the time between adjacent
ATP binding events would be ~67-fold longer than for the studies in this
report, making it more likely that RecBCD could switch to a new
conformational sub-state. Therefore, we conclude that the binding of ATP
and DNA to RecBCD fixes the conformational state, which in turn 
defines the unwinding rate for the duration of a single processive 
unwinding event, contributing to the observed heterogeneity in rates 
of (and between) individual enzymes. 
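
The numbers in this paragraph can be checked with a short calculation. The sketch below assumes a single-exponential rise towards a plateau, takes the plateau to be the 47% of molecules that switched after the 20-s pause, and uses one binomial standard error as the error bound; none of these choices is stated in the text, and the names `fraction_switched` and `binomial_bounds` are illustrative only.

```python
import math

HALF_LIFE_S = 1.3   # half-life of the exponential fit reported in the text
PLATEAU = 0.47      # assumption: fraction switched after the 20-s pause


def fraction_switched(pause_s, half_life_s=HALF_LIFE_S, plateau=PLATEAU):
    """Assumed single-exponential rise towards a plateau."""
    rate_constant = math.log(2) / half_life_s
    return plateau * (1.0 - math.exp(-rate_constant * pause_s))


def binomial_bounds(switched, total):
    """Proportion with +/- one binomial standard error, clipped to [0, 1]."""
    if total <= 0 or not 0 <= switched <= total:
        raise ValueError("need 0 <= switched <= total and total > 0")
    p = switched / total
    se = math.sqrt(p * (1.0 - p) / total)
    return p, max(0.0, p - se), min(1.0, p + se)


predicted_2s = fraction_switched(2.0)     # compare with the reported 33%
predicted_20s = fraction_switched(20.0)   # essentially the plateau

# 61 slowed + 21 sped up out of 173 restarted molecules (20-s pause).
p20, low20, high20 = binomial_bounds(61 + 21, 173)

# Reaction-channel ATP (1 mM) relative to the earlier study (15 uM).
atp_fold = 1000.0 / 15.0

print(f"predicted after 2 s: {predicted_2s:.2f}; after 20 s: {predicted_20s:.2f}")
print(f"observed at 20 s: {p20:.2f} ({low20:.2f} to {high20:.2f})")
print(f"ATP concentration ratio: about {atp_fold:.0f}-fold")
```

Under these assumptions the model gives roughly 0.31 for a 2-s pause, close to the 33% reported, and the ATP concentration ratio reproduces the ~67-fold figure if the interval between binding events scales inversely with ATP concentration.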

To determine whether the conformational changes are stochastic 
for any individual molecules, we halted some enzyme molecules twice 
using the same procedure, and asked whether the rate changes after 
each interruption were correlated. The individual molecules (n = 34) 
exhibited both decreased and increased rates after each pause, as seen 
above (Supplementary Fig. 10), and we found no correlation between 
the relative changes in rate as the results of the two consecutive pauses 
(Fig. 2e). 
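
A minimal sketch of the rank-correlation calculation used for this comparison is given below. It is a plain-Python illustration of the Spearman coefficient (the Pearson correlation of the ranks, with tied values given their average rank), not the authors' analysis code, and the example values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sample is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical relative rate changes after the first and second pause.
first_pause = [-0.40, 0.05, 0.12, -0.10, 0.30]
second_pause = [0.10, -0.35, 0.02, 0.25, -0.05]
rho = spearman_rho(first_pause, second_pause)
```

A coefficient near zero, as reported for the 34 molecules paused twice, indicates that the direction and size of the second change cannot be predicted from the first.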

Earlier studies on the behaviour of other single enzymes have reported
static heterogeneity in catalytic rates owing to variation in the covalent
structures, the presence of metastable conformations or dynamic
heterogeneity caused by conformational fluctuation. In this work, we
found that the heterogeneity in the DNA unwinding rates by RecBCD is 
static on the experimental time scale of DNA unwinding for tens of 
thousands of base pairs. However, the rates are not intrinsic to individual 
molecules; thus, the heterogeneity cannot be explained by possible vari- 
ations in the covalent structures of the enzyme. Instead, any individual 
molecule can adopt any conformation within the initially accessible free 
energy landscape after depletion of a ligand for a few seconds. The 
ergodic hypothesis posits that the (infinite) time-averaged behaviour 
of a molecule at equilibrium is equal to the ensemble-average of an 
infinite collection of those molecules. Thus, if a single enzyme molecule 
could be repeatedly stopped and observed, it should adopt all the pos- 
sible conformations that are accessible for those conditions of ther- 
modynamic state. Clearly, we cannot examine a single molecule for an 
infinite number of times, but a corollary of the ergodic hypothesis is that 
if one could watch any single molecule in an equilibrium distribution 
that could randomly switch at least once to a new conformation, then the 
distribution of those new states should recapitulate the original distri- 
bution, if indeed the first distribution was at equilibrium. By watching a 
collection of individual enzymes switching a limited number of times, 
here we show that they can switch to velocities found in the original 
distribution. Therefore, we conclude that these seemingly static RecBCD 
molecules can switch into microstates existing within the original 
ensemble. Also, when transitions remain within each macrostate, the 
new distribution of velocities is completely random, manifesting an 
expectation of ergodic behaviour. Unexpectedly, the lifetimes of these 
kinetic states are atypically long, and are dictated by ligand occupancy. 
We imagine that the conformation of the enzyme is dynamic in the 
absence of ligands and that a single conformation is selected and
stabilized, that is, made seemingly static, upon ligand binding. These
findings help us to understand the influence of ligand binding on protein
conformations, conformational selection and enzymatic reactions, and 
they now raise the intriguing structural question of how sub-states that 
vary in speeds by hundreds of base pairs per second can be maintained 
by these quasi-stable enzymatic conformations. Finally, the possible 
biological function of heterogeneity in a population of individual mole- 
cules is unknown and is difficult to define. However, we offer the plau- 
sible speculation that the variation seen for populations of individual 
molecules is akin to the epigenetic variation in the populations of organ- 
isms. Given the stochastic nature of life, a population of cells—bacteria in 
this specific case—needs both diversity and flexibility to respond to the 
random nature of natural challenges. We suggest that the variation in 
individual molecule behaviour affords a molecular plasticity in the cel- 
lular functions of RecBCD to respond to unpredictable needs. RecBCD 
has two seemingly contradictory functions: one is the degradation of
foreign duplex DNA (for example, DNA viruses) and the other is the
repair of broken chromosomal DNA. The regulation of these activities is
controlled by recognition of the DNA regulatory sequence χ. Each E. coli
cell contains only ten RecBCD enzyme molecules, and each cell suffers 
~0.5 DNA breaks per cell cycle and is exposed to an unpredictable 
amount of phage or foreign DNA. If RecBCD were limited to one con- 
formation, or if it could adopt multiple conformations but these confor- 
mations rapidly equilibrated after each step of processive unwinding, 
then all DNA would be processed at the same rate. Given the probabilistic 
nature of DNA breaks and the appearance of foreign DNA, conforma- 
tional heterogeneity coupled with conformation selection of a kinetically 
stable functional form of RecBCD can ensure a stochastic but broad 
cellular response. Consequently, if the few RecBCD molecules present 
can adopt a wide range of conformational states, then survival through 
random selection is more likely, and the surviving cells, within a popu- 
lation of cells, are not constrained genetically. By coupling dynamic 
disorder in the ensemble with subsequent random selection of conforma- 
tions that remain static during processive DNA unwinding, both
molecules and cells can respond probabilistically to unpredictable situations
with just a handful of molecules. From the perspective of a population of 
cells, although some will perish, a random fraction will have survived by 
throwing the dice productively. 


METHODS SUMMARY 


Single-molecule DNA helicase reactions were performed using an optical trapping
and microfluidics system as reported with minor modification. For the pausing
experiments, a three-channel flow cell was used. The first channel contained
bead-DNA complexes and 2 mM Mg(OAc)₂ in single-molecule buffer (SMB; 45 mM
NaHCO₃ (pH 8.3), 15% (w/v) sucrose, 50 mM dithiothreitol and 20 nM YOYO-1
dye). The second channel contained 1 mM ATP and 2 mM Mg(OAc)₂ in SMB.
The third channel contained 10 mM EDTA or 2 mM Mg(OAc)₂ in SMB.

For comparison of the rate distributions, the two-sample Kolmogorov-Smirnov 
test was used. For correlation analysis, Spearman rank correlation test was used. 
All P values reported for statistical analysis refer to the two-tailed probability of the 
tests. 
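
The two-sample Kolmogorov-Smirnov comparison can be sketched in plain Python as follows. This is an illustration of the test rather than the authors' code: the P value uses the standard asymptotic Kolmogorov series with a common small-sample correction, which is an assumption of this sketch, and the example rates are hypothetical.

```python
import math


def ks_statistic(sample_a, sample_b):
    """Largest vertical gap between the two empirical distribution functions."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d_max = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] == x:   # step past every copy of x in a
            i += 1
        while j < len(b) and b[j] == x:   # and in b
            j += 1
        d_max = max(d_max, abs(i / len(a) - j / len(b)))
    return d_max


def ks_p_value(d, n_a, n_b, terms=100):
    """Asymptotic two-tailed P value from the Kolmogorov distribution."""
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:          # the series equals 1 to within rounding here
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical unwinding rates (bp/s) for a control and a treated sample.
control = [1650, 1720, 1780, 1810, 1900, 760, 820]
treated = [1600, 1700, 1750, 1830, 1880, 540, 600]
d = ks_statistic(control, treated)
p = ks_p_value(d, len(control), len(treated))
```

For small samples an exact P value is preferable to the asymptotic series; a statistics package would normally be used for that.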


Full Methods and any associated references are available in the online version of
the paper.

METHODS


Proteins and DNA substrates. E. coli RecBCD enzyme was expressed and purified
as described previously. To check purity, protein was analysed using a 12%
denaturing polyacrylamide gel (1:29 bis:acrylamide in TBE buffer (89 mM Tris base,
2 mM EDTA, 89 mM boric acid), containing 10% SDS) stained with Coomassie blue
dye. After electrophoresis, the gel was imaged using an AlphaInnotech gel
documentation system. The two mutant enzymes, RecBCD^K177Q (RecBCD*) and
RecB^K29Q CD (RecB*CD), were purified as described.

Bacteriophage λ DNA (N⁶-methyladenine-free lambda DNA, New England
Biolabs) was biotinylated by ligation to a 3′-biotinylated 12-mer oligonucleotide
(cosA: 5′-GGGCGGCGACCT-3′ or cosB: 5′-AGGTCGCCGCCC-3′, Operon
Technologies) that is complementary to one of the cohesive ends of λ DNA;
except for the thermal re-annealing and control experiments, where cosA was
used, all other experiments used the cosB oligonucleotide.

The pUC19 plasmid DNA was purified by caesium chloride gradient
centrifugation. The circular DNA was linearized with NdeI restriction endonuclease (New
England Biolabs) followed by heat inactivation and phenol/chloroform/isoamyl
alcohol extraction. The DNA concentration was determined by absorbance at
260 nm using an extinction coefficient of 6,330 M⁻¹ (nucleotides) cm⁻¹.
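
The conversion from absorbance to nucleotide concentration follows the Beer-Lambert law. The sketch below uses the extinction coefficient given in the text; the 1-cm path length, the dilution argument and the example reading are assumptions added for illustration.

```python
EXTINCTION_M_CM = 6330.0   # M^-1 (nucleotides) cm^-1, from the text


def nucleotide_concentration_molar(a260, path_cm=1.0, dilution=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), corrected for dilution."""
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    return a260 * dilution / (EXTINCTION_M_CM * path_cm)


# Hypothetical reading: A260 = 0.633 in a 1-cm cuvette, undiluted sample.
conc_uM = nucleotide_concentration_molar(0.633) * 1e6
print(f"{conc_uM:.1f} uM nucleotides")
```

The result is a nucleotide concentration; dividing by the number of nucleotides per molecule would give the concentration of DNA molecules.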

ATP hydrolysis assays. The ATP hydrolysis activity of the enzyme was measured
spectrophotometrically as reported by coupling ATP hydrolysis to NADH oxidation
using an Agilent Technologies Model 8452A diode array spectrophotometer. The
assay mixtures contained 25 mM Tris acetate (pH 7.5), 1 mM dithiothreitol (DTT),
2 mM ATP, 5 mM magnesium acetate, 1.5 mM phosphoenolpyruvate, 0.2 mg ml⁻¹
NADH, 30 U ml⁻¹ pyruvate kinase, 30 U ml⁻¹ lactate dehydrogenase and 50 µM
(nucleotides) poly(dT). Reactions were initiated by the addition of 0.5 nM RecBCD
enzyme after pre-incubation of all other components at 37 °C for 5 min. The rate of
ATP hydrolysis was calculated from the rate of change in absorbance at 340 nm due
to oxidation of NADH using the following conversion: rate of A340 nm decrease
(s⁻¹) × 9,820 ÷ 0.0005 (µM RecBCD) ÷ 60 = rate of ATP hydrolysis (s⁻¹).

Re-purification of RecBCD. RecBCD enzyme (0.1 mg) from −80 °C freezer
stock was thawed on ice, diluted fivefold using cold B100 buffer (20 mM
Tris-HCl (pH 7.5), 0.1 mM EDTA, 0.1 mM DTT and 100 mM NaCl), and loaded onto a
1-ml MonoQ column (Amersham Biosciences). The enzyme was eluted using a
gradient from 300 mM to 450 mM NaCl in 30 column volumes. Three fractions
(100 µl each) on one side of the peak in ultraviolet absorbance were used
immediately for single-molecule helicase assays.

Stopped-flow dye-displacement helicase assay. Essentially, the protocols used
previously were followed. Experiments were performed in an Applied Photophysics
SX.18MV-R stopped-flow apparatus with excitation at 355 nm (bandwidth 9.3 nm),
and emission was measured using a 450 nm long-pass filter. Unless stated otherwise,
all reported concentrations are final after mixing of equal volumes in the
stopped-flow apparatus. Reactions were performed at 25 °C in a buffer containing 25 mM Tris
acetate (pH 7.5), 6 mM magnesium acetate, 1 mM DTT, 200 nM Hoechst 33258 dye
(Molecular Probes) and 300 nM ssDNA-binding protein (SSB). The RecBCD
enzyme, at the final concentration indicated, was incubated with 0.05 nM (molecules)
NdeI-cut pUC19 DNA (equivalent to 0.1 nM RecBCD binding sites) for 5 min, and
this was then mixed with 2 mM ATP to initiate the reaction. Data were analysed using
GraphPad Prism 5.02 (GraphPad Software). Unwinding rates were determined by a
linear fit to the first 2 s of each trace.
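
An initial-rate estimate of this kind amounts to an ordinary least-squares line through the early part of the trace. The sketch below is a generic illustration and not the GraphPad Prism procedure used in the paper; the function name `initial_rate` and the synthetic trace are hypothetical.

```python
def initial_rate(times, signal, window_s=2.0):
    """Least-squares slope over the first `window_s` seconds of a trace.

    Returns (slope, intercept, standard error of the slope).
    """
    points = [(t, y) for t, y in zip(times, signal) if t <= window_s]
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points inside the window")
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_tt = sum((t - mean_t) ** 2 for t, _ in points)
    if s_tt == 0:
        raise ValueError("time points must not all be identical")
    s_ty = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = s_ty / s_tt
    intercept = mean_y - slope * mean_t
    residual_ss = sum((y - (intercept + slope * t)) ** 2 for t, y in points)
    slope_se = (residual_ss / (n - 2) / s_tt) ** 0.5
    return slope, intercept, slope_se


# Hypothetical trace: a signal falling linearly for 2 s, then levelling off.
times = [0.1 * i for i in range(41)]                  # 0 to 4 s
signal = [100.0 - 20.0 * min(t, 2.0) for t in times]
slope, intercept, slope_se = initial_rate(times, signal)
```

Restricting the fit to the first 2 s keeps the plateau of the trace from flattening the estimated slope.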

Thermal treatment of RecBCD. Aliquots of the RecBCD enzyme in storage
buffer (20 mM Tris-HCl (pH 7.5), 0.1 mM EDTA, 0.1 mM DTT, 100 mM NaCl
and 50% (v/v) glycerol) were thawed on ice and then heated to 45 °C for 10 min
followed by slow cooling at 1 °C min⁻¹ down to 4 °C using a GeneMate PCR
machine. The untreated controls were kept on ice until use.

Chemical unfolding of RecBCD. Aliquots of the RecBCD enzyme were thawed
on ice. Various concentrations of GuHCl were mixed in a 1:1 volume ratio with the
enzyme. After incubation at room temperature (~23 °C) for 1 h, the sample was
dialysed against B100 buffer (20 mM Tris-HCl (pH 7.5), 0.1 mM EDTA, 0.1 mM
DTT and 100 mM NaCl) at 4 °C for 24 h and the dialysis buffer was changed once.
The next day, samples were collected and the concentrations were measured after 
centrifugation. Samples were taken for ATPase assays, and the rest were used for 
single-molecule assays. 

Optical trapping and fluorescence microscopy. Single-molecule DNA helicase
reactions were performed using an optical trapping system as reported with some
modifications. The system is constructed around a Nikon Eclipse microscope
(Nikon). A high-pressure mercury lamp (100 W; USHIO America) and Y-FL 4-cube
Epi-Fluorescence (Nikon) attachment were used for illumination. Images were
captured using a high sensitivity electron bombardment charge-coupled device (CCD)
camera (EB-CCD C7190; Hamamatsu Photonics) and digitized online using an
LG-3 frame grabber (Scion Corporation) at 30 frames s⁻¹. The optical trap was
created by focusing a 1,064 nm laser (Nd:YVO4, 6 W max, J-series power supply;
Spectra Physics) through a high numerical aperture (NA) objective (×100/1.3 oil
DIC H; Nikon). A high NA objective is necessary to create an intensity gradient
sufficiently large to form the trap. The laser is expanded with a 20× beam expander
(HB-20XAR.33; Newport) to fill the back aperture of the objective. The laser is
collimated and aligned using a 1× telescope. The laser is reflected along the optical
axis of the microscope by means of a low-pass dichroic mirror placed between the
objective and the fluorescence cube.

Experiments were carried out in a multi-channel microfluidic flow cell secured on 
a computer controlled motorized stage (MS-2000; Applied Scientific Instruments) 
mounted on the microscope. The design of the flow cell allows laminar flow of 
different solutions without mixing. The solutions are introduced into the flow cell 
by a syringe pump with multiple syringes (KD Scientific), generating a flow rate of
~100-150 µm s⁻¹. PEEK tubing (Upchurch Scientific) is used to connect the
syringes to the flow cell. The microfluidic system permits imaging of protein-DNA
complexes on a single molecule of flow-stretched DNA; it also enables the rapid
movement of the sample to the different buffers in the channels of the flow cell. The
position of the stage and, hence, the flow cell is controlled using a custom-built
program. Bead-DNA complexes can be moved between adjacent solution channels
within 1 s via the movement of the stage. For the pausing experiments, a
three-channel flow cell was used.

Single-molecule DNA helicase reactions. The protocol used for DNA-bead
preparation was modified from that used previously. Biotinylated λ DNA
(100 pM in 1-2 µl) was incubated with 1-2 µl of 1 µm ProActive
streptavidin-coated microspheres (~35 pM; Bangs Laboratories) for 1 h on ice or at 37 °C.
Bead-DNA complexes were then transferred into 0.5 ml of de-gassed sample
solution containing 45 mM NaHCO₃ (pH 8.3), 20% (w/v) sucrose, 50 mM DTT
and 20 nM YOYO-1 dye (Molecular Probes). DNA was incubated with the dye for
at least 1 h in the dark at room temperature. Immediately before transfer to the
sample syringe, magnesium acetate and RecBCD, to final concentrations of 2 mM
and 10-60 nM, respectively, were added to the sample mixture. For the control, the
RecBCD storage buffer without RecBCD was used to replace the enzyme solution.
The reaction solution contained 45 mM NaHCO₃ (pH 8.3), 20% (w/v) sucrose,
50 mM DTT, 1 mM ATP, 2 mM magnesium acetate and 20 nM YOYO-1 dye. For
the pausing experiments, the three channels were as follows. The first channel
contained bead-DNA complexes and 2 mM Mg(OAc)₂ in SMB (45 mM NaHCO₃
(pH 8.3), 15% (w/v) sucrose, 50 mM dithiothreitol and 20 nM YOYO-1 dye). The
second channel contained 1 mM ATP and 2 mM Mg(OAc)₂ in SMB. The third
channel contained either 10 mM EDTA or 2 mM Mg(OAc)₂ in SMB; the two
solutions used as indicated were either: 45 mM NaHCO₃ (pH 8.3), 15% (w/v)
sucrose, 50 mM DTT, 10 mM EDTA and 20 nM YOYO-1 dye, or 45 mM
NaHCO₃ (pH 8.3), 15% (w/v) sucrose, 50 mM DTT, 2 mM magnesium acetate
and 20 nM YOYO-1 dye.

Single-molecule data analysis. Videos were digitized through an LG-3
frame-grabber card using an ImageJ plugin. Images were then averaged and the length of
the DNA molecule in each frame was measured using a custom-built ImageJ
plugin. The experimental data were fitted to either a line or a three-segment line
using Origin 7.5 (OriginLab Corp.) or GraphPad Prism 5.02 (GraphPad Software,
Inc.). The translocation rates of RecBCD were calculated from the slopes of the
corresponding segments and the standard errors of the best-fit values are reported.
Unless otherwise indicated, standard deviation is reported for statistical analysis of a
number of molecules. The analysis method has an estimated resolution of 50 bp s⁻¹.
The difference in the unwinding rates between the fast and slow populations is
significantly beyond the experimental uncertainty. When a distribution of
unwinding rates was plotted, the rates were grouped in 100 bp s⁻¹ bins. The distributions
were fitted to the sum of two Gaussian curves, unless otherwise noted. Error bars in
Fig. 2d represent the expected bounds assuming a binomial distribution of
switching events for the given sample size. For comparison of the rate distributions, the
two-sample Kolmogorov-Smirnov test was used. For correlation analysis, the
Spearman rank correlation test was used. All P values reported for statistical analysis refer
to the two-tailed probability of the tests.

Folate receptors (FRα, FRβ and FRγ) are cysteine-rich cell-surface
glycoproteins that bind folate with high affinity to mediate cellular
uptake of folate. Although expressed at very low levels in most tissues,
folate receptors, especially FRα, are expressed at high levels in
numerous cancers to meet the folate demand of rapidly dividing cells under
low folate conditions. The folate dependency of many tumours has
been therapeutically and diagnostically exploited by administration
of anti-FRα antibodies, high-affinity antifolates, folate-based imaging
agents and folate-conjugated drugs and toxins. To understand
how folate binds its receptors, we determined the crystal structure of
human FRα in complex with folic acid at 2.8 Å resolution. FRα has a
globular structure stabilized by eight disulphide bonds and contains
a deep open folate-binding pocket composed of residues that are
conserved in all receptor subtypes. The folate pteroate moiety is buried
inside the receptor, whereas its glutamate moiety is solvent-exposed
and sticks out of the pocket entrance, allowing it to be conjugated to
drugs without adversely affecting FRα binding. The extensive
interactions between the receptor and ligand readily explain the high
folate-binding affinity of folate receptors and provide a template
for designing more specific drugs targeting the folate receptor system.

Folates (vitamin B9) are important one-carbon donors for the
synthesis of purines and thymidine (essential components of nucleic
acids) and indirectly, via S-adenosyl methionine, for methylation of
DNA, proteins and lipids. Folate deficiency is therefore associated
with many diseases, including fetal neural tube defects, cardiovascular
disease and cancers. In adult tissues, folate is mainly taken up by
reduced folate carrier, a ubiquitously expressed anion channel that has
relatively low folate-binding affinity (Km = 1-10 µM). By contrast,
high-affinity uptake of the food supplement folic acid (Kd < 1 nM)
and the physiologically prevalent folate N⁵-methyltetrahydrofolate
(5-mTHF) requires the function of three subtypes of folate receptor (FRα,
FRβ and FRγ), which are cysteine-rich glycoproteins that mediate
folate uptake through endocytosis. Inside the cell, the acidic
environment of the endosome promotes the release of folate from receptors,
which is then transported into the cytoplasm by proton-coupled folate
transporter. The expression of folate receptors is largely restricted
to cells important for embryonic development (for example, placenta
and neural tubes) and folate resorption (kidney). Among the three
FR isoforms, FRα is the most widely expressed, with very low levels
in normal tissues, but high expression levels in many tumours. As
such, FRα has become the molecular target for the development of
many cancer therapeutics, including anti-FRα antibodies, high-affinity
anti-folates, folate-based imaging agents and folate-conjugated drugs
and toxins. Despite intense research on the folate structure-activity
relationship, the molecular basis for the high-affinity recognition
of folates by FRα remains elusive owing to the technical difficulties
in expression, purification and crystallization of FRα for structural
studies.

To obtain FRα protein for structural studies, we stably expressed
human FRα lacking its carboxy-terminal glycophosphatidylinositol
anchor as a secreted IgG Fc fusion protein (FRα-Fc) in HEK293 cells.
As fully glycosylated fusion protein purified from culture medium
yielded poorly diffracting crystals, we reduced crystallization-inhibiting
glycosylation heterogeneity by combined treatment with kifunensine
and endoglycosidase H, which together reduce complex carbohydrates
to single N-acetylglucosamine (NAG) moieties (Supplementary
Fig. 1a, b). The deglycosylated FRα-Fc had a similar folic acid-binding
affinity (~190 pM) to the fully glycosylated protein (Supplementary
Fig. 1d) and yielded crystals, which diffracted to 2.8 Å (Supplementary
Fig. 1c). We solved the structure by combining the phase information
from one Pt derivative and six native S anomalous data sets (see
Methods and Supplementary Table 1).

Figure 1 | Structure of FRα bound to folic acid. a, Two views of the
complex, with FRα in green, folic acid in grey, NAG in orange and the
disulphide bonds depicted as yellow sticks. The N and C termini are
labelled. b, Ribbon diagram of FRα, with folic acid and NAG in green
stick presentations, overlaid with the semi-transparent receptor surface.
c, Charge distribution surface of FRα with a close-up view of the
ligand-binding pocket entrance. Folic acid carbon atoms are coloured grey,
nitrogen atoms blue, and oxygen atoms red. A colour-code bar
(bottom) shows an electrostatic scale from −3 to +3 eV.

Figure 2 | Structural and biochemical analysis of FRα-folic acid
interactions. a, The σA-weighted 2Fo − Fc electron density map for folic acid,
shown as a grey mesh. b, The internal charge distribution surface of the binding
pocket is shown using the same colour code as in Fig. 1c, with folic acid shown
in stick presentation. c, Folic acid-binding network with close-ups of the folic
acid head and tail groups. Residues that line the binding pocket are shown in
green and folic acid is shown in grey. Hydrogen bonds are indicated by dashed
lines.


FRα has an overall globular structure, comprising four long α-helices
(α1, α2, α3, α6), two short α-helices (α4, α5), four short β-strands
(β1–β4) and many loop regions (Fig. 1a, b). The tertiary structure is
greatly stabilized by eight disulphide bonds formed by 16 conserved
cysteine residues (C15–C43, C35–C83, C44–C87, C67–C153, C74–C124,
C113–C187, C117–C167 and C130–C147). FRα has three predicted
N-glycosylation sites at N47, N139 and N179. Clear electron
density for NAG is observed for N47 and N139, and partial electron
density for N179. The overall fold of FRα is similar to that of
riboflavin-binding protein (22% sequence identity to FRα), with a root mean
squared deviation of 1.56 Å for 163 Cα atoms, but the two proteins have
very differently shaped ligand pockets and ligand-binding modes
(Supplementary Fig. 2).

The core domain consists of helices α1, α2, α3 and α5 tied together
by four disulphide bridges (C35–C83, C44–C87, C74–C124 and C117–C167;
Fig. 1a). The structure of FRα contains a long and open folate-binding
pocket, which is formed by α1, α2 and α3 in the back; the
amino-terminal loop, β1 and β2 in the bottom; the α1–α2 and α3–α4
loops in the left and top; and α4, α5, β4 and β3 in the right (Fig. 1a, b).
Folic acid is oriented with its basic pteroate moiety docked deep inside
of the negatively charged pocket and the two negatively charged carboxyl
groups of its glutamate moiety sticking out of the positively charged
entrance of the ligand-binding pocket, which is formed by the α1–α2,
β1–β2 and α3–α4 loops (Figs 1 and 2b).

Clear electron density was observed for folic acid and the surrounding
amino acid residues, which allowed for accurate modelling of the
ligand and its interacting residues lining the binding pocket (Fig. 2a
and Supplementary Fig. 3). The cross-section of the binding pocket
reveals the complementary shape and charge between the bound
ligand and the receptor (Fig. 2b). Folic acid docks into an extended
groove of FRα in the direction roughly perpendicular to the plane
formed by helices α1, α2 and α3, with the pterin head group buried
inside against the back formed by α1, α2 and α3 (Figs 1a and 2b, c). The
interactions around the pteroate moiety contain both hydrogen bonds
and hydrophobic interactions. First, the pterin ring is stacked between
the parallel side chains of Y85 and W171, and capped by Y175 (Fig. 2c).
Second, the hydrophilic pterin ring N and O atoms form a series of
hydrogen bonds with receptor residues. Specifically, the pterin N1 and
N2 atoms form strong hydrogen bonds with the side-chain carboxyl
group of D81, the N3 and O4 atoms with the S174 hydroxyl group, the
O4 atom forms two hydrogen bonds with the guanidinium groups of
R103 and R106, and the N5 atom forms one hydrogen bond with the
H135 side chain (Figs 2c and 3a). Interestingly, folic acid O4 is replaced
by an amino group in the antifolates methotrexate and aminopterin,
which have reduced affinity for FRα. The amino group would not
allow for the formation of hydrogen bonds with R103 and R106 and
would sterically clash with the position of R103 (see Fig. 3a) in the folic
acid-bound structure, providing a structural rationale for the poor
FRα-binding of these two compounds and their preferential uptake
by reduced folate carrier.

The folic acid aminobenzoate is stabilized by hydrophobic interac- 
tions with Y60, W102 and W134, which line the middle of the long ligand- 
binding pocket (Fig. 3a). Extensive interactions are also observed for 
the glutamate group, which engages six hydrogen bonds, contributed 
by the side chains of W102, K136 and W140, as well as by backbone 
interactions with H135, G137 and W138 (Figs 2c and 3a). Most resi- 
dues involved in ligand binding are identical among different subtypes 
of FR regardless of their origins (Supplementary Figs 4 and 5), indi- 
cating that the observed folate-binding interactions are probably con- 
served in all three different receptor subtypes. In addition, the most 
physiologically prevalent folate, 5-mTHF, can be easily docked into the 
FRα ligand-binding pocket in a mode very similar to that of folic acid,
suggesting that the fundamental mechanism of folate recognition is 
conserved (Supplementary Fig. 6). 

To validate the structural observations, we examined the ligand-binding
affinities of FRα mutants that have alanine mutations in the
key folate-contacting residues. The W171A mutation abolished the
expression of the receptor (Supplementary Fig. 7a), suggesting that
this residue is critical for protein stability. All other mutants expressed
relatively well and were purified to determine their folate-binding
affinity by radioligand-binding assay (Supplementary Figs 7b and
8b). Whereas wild-type FRα bound to [³H]-folic acid with a Kd of
~0.19 nM, replacement of D81 decreased affinity by more than one
order of magnitude, consistent with the strong interaction of the aspartate
carboxyl oxygens with the pterin N1 and N2 nitrogens, and indicating
that this interaction is a key contributor to high-affinity ligand
binding. By contrast, mutations of Y175, K136 and R106 (bond
lengths ≥ 3.1 Å) have little effect, and mutations of all other ligand-binding
residues (hydrogen bonds ≤ 3.0 Å) have only moderate effects
on folic acid binding (affinity decreases of ≤ 3.6-fold), which are approximately
additive for the double mutants R103A/S174A and W102A/R103A.
This extensive interaction network therefore makes FRα–folic
acid binding remarkably resistant to single amino acid substitutions
(Fig. 3b and Supplementary Figs 7b and 8b). Together, the structural
and mutational analyses present a structural rationale for the absolute
requirement of the pterin group for anchoring folate in the binding
pocket of the receptor and for the availability of the glutamate group
Figure 3 | Folic acid affinities of FRα ligand-binding-pocket mutants.
a, Interaction map of folic acid with ligand-binding-pocket residues. The folic
acid chemical structure is shown in magenta, pocket residues in black and
hydrogen bonds as green dashed lines with bond distances (Å) indicated.
Hydrophobic interactions are presented as curved red lines. The pteroate and
glutamate moieties of folic acid are indicated above the map. b, Folic acid
affinities of wild-type and mutant FRα proteins as measured by [³H]-folic acid
binding assay (see Supplementary Figs 7 and 8 for binding isotherms). The
numbers on top of the bars indicate the fold decrease in affinity (increase in Kd)
relative to wild-type FRα. Error bars indicate s.d. (n = 2).


for conjugation with drugs and imaging reagents, without adversely
affecting the interactions between receptor and ligand.
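The fold decreases quoted for the mutants are ratios of dissociation constants, and each ratio corresponds to a binding free-energy penalty through ΔΔG = RT ln(fold). The following is a minimal Python sketch of that arithmetic. It is not the authors' analysis: the wild-type Kd of 0.19 nM comes from the text, whereas the mutant Kd values, the function names and the 298.15 K temperature are assumptions chosen for illustration.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant, kcal/(mol*K)
KD_WILD_TYPE_NM = 0.19       # wild-type Kd quoted in the text (~0.19 nM)


def fold_decrease(kd_mutant_nm, kd_wild_type_nm=KD_WILD_TYPE_NM):
    """Fold decrease in affinity, i.e. the increase in Kd relative to wild type."""
    return kd_mutant_nm / kd_wild_type_nm


def delta_delta_g(fold, temperature_k=298.15):
    """Binding free-energy penalty (kcal/mol) implied by a fold change in Kd."""
    return R_KCAL_PER_MOL_K * temperature_k * math.log(fold)


if __name__ == "__main__":
    # Hypothetical mutant Kd values (nM), for illustration only.
    for name, kd in {"mutant_1": 0.40, "mutant_2": 2.5}.items():
        fold = fold_decrease(kd)
        print(f"{name}: {fold:.1f}-fold, ddG = {delta_delta_g(fold):.2f} kcal/mol")
```

On this scale a tenfold loss of affinity costs roughly 1.4 kcal/mol at room temperature, which gives a feel for why the text treats decreases of a few fold as moderate.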

In summary, many cancers highly express FRα, which has therefore
become an important target for receptor-mediated chemotherapy.
How FRα binds to folate and folate-conjugated drugs, however, has
remained unknown. The FRα–folic acid complex structure illustrates
how the receptor assumes a deep folate-binding pocket that is formed
by conserved residues across all receptor subtypes and provides detailed
insights into how folic acid interacts with its receptors. Together, these
observations establish a rational foundation for designing specific drugs
targeting the folate receptor system.


METHODS SUMMARY 


FRα was stably expressed in HEK293 cells as a secreted IgG Fc fusion protein in cell
medium supplemented with folic acid and kifunensine. Proteins were purified 
from the conditioned cell media by nickel-nitrilotriacetic acid chromatography 
(Ni-NTA), followed by proteolytic release of the Fc tag, enzymatic deglycosylation 
and size-exclusion chromatography. Crystals were grown by vapour diffusion. 
One native data set, one Pt derivative data set and six S anomalous signal data sets 
were collected from cryo-protected crystals at beamlines 21-ID-D at the Advanced 
Photon Source at Argonne National Laboratories. 
Initial phases were established using the SHELX program with Pt-soaked
derivative data and S anomalous data. Density modification for the initial electron
density map was performed using DM. A crude model was built automatically
using the CCP4 program buccaneer, improved by manual building using Coot
and using the riboflavin-binding protein structure as a reference. The eight molecules
of the FRα model were located in one asymmetric unit by molecular replacement
and further refined using the Refmac program of CCP4 (ref. 22).

Mutant proteins were expressed in HEK293 cells after transient transfection 
and purified using Ni-NTA chromatography. Folic acid binding was determined 
by saturation radioligand-binding assay using ³H-labelled folic acid.


Full Methods and any associated references are available in the online version of 
the paper. 
METHODS


Protein expression and purification. The human FRα (residues 23–234)
complementary DNA, excluding the secretion signal peptide (residues 1–22) and
glycophosphatidylinositol anchor signal peptide (residues 235–257), was expressed as a
human IgG Fc fusion protein from the expression vector pcDNA6. This construct
also contained a murine Igκ leader sequence at the N terminus to allow target
protein secretion into media supernatant, a thrombin cleavage site between FRα
and Fc, and a His tag after the Fc tag. For small-scale expression, HEK293 cells
were transiently transfected with the FRα–Fc DNA. Media supernatants were
collected after 4 days and dialysed against 20 mM Tris, pH 8.0, 0.15 M NaCl, 5%
glycerol before nickel-nitrilotriacetic acid (Ni-NTA) chromatography. For
large-scale purification, a stable HEK293 cell line expressing FRα–Fc was established by
selection of HEK293 cells transiently transfected with FRα–Fc DNA in the presence
of 10 µg ml−1 blasticidin (Invitrogen). Single colonies were grown in 24-well plates
and expression of secreted FRα–Fc fusion protein in cell media supernatants was
examined by biolayer interferometry using an Octet Red instrument (ForteBio)
and by immunoblot analysis.

For large-scale purifications, a stable clone was maintained in 500 ml of DMEM
supplemented with 5% fetal bovine serum, 20 mM HEPES, 5 µM kifunensine and
200 µM folic acid in one-litre roller bottles at 37 °C. Two litres of conditioned media
were collected, concentrated to 400 ml and dialysed against buffer C (25 mM Tris,
pH 8.0, 150 mM NaCl, 1 µM folic acid) at 4 °C overnight before loading on a 50-ml
Ni-chelating Sepharose column (GE Healthcare). The column was washed with
300 ml buffer A (25 mM Tris, pH 8.0, 150 mM NaCl, 25 mM imidazole, 10%
glycerol, 1 µM folic acid) and eluted with buffer A plus 500 mM imidazole. Peak
fractions were pooled, digested with thrombin at a 1:1,000 mass ratio during
overnight dialysis against buffer C at 4 °C to remove imidazole, and loaded on a
5-ml Ni-chelating Sepharose column (GE Healthcare) to remove the Fc-His tag.
The flow-through was collected, adjusted to pH 5.6 and deglycosylated with
endoglycosidase Hf (New England Biolabs). Deglycosylated protein was finally
separated by Sephadex S-200 gel filtration in 25 mM Tris, pH 8.0, 200 mM ammonium
acetate, 1 mM EDTA and 1 µM folic acid. The protein eluted from the gel-filtration
column at a volume corresponding to the size of a monomer at a purity >95% as
judged by SDS–PAGE (Supplementary Fig. 1).

Crystallization. Purified FRα protein was concentrated to about 7 mg ml−1 before
crystallization trials. Initial screening identified that polyethylene glycol (PEG) is
favourable for crystal formation. Optimization trays using PEG were set up manually
using the hanging drop method at 20 °C. Needle-shaped crystals were obtained,
which diffracted X-rays to about 9–10 Å. To reduce glycosylation, FRα protein was
expressed in the presence of 5 µM kifunensine (GlycoSyn) and purified FRα
protein was further deglycosylated with endoglycosidase Hf (New England
Biolabs) (Supplementary Fig. 1). Crystals were grown at 20 °C in hanging drops
containing 1.5 µl of the purified protein and 1 µl of well solution (0.1 M MES,
pH 6.5, 12% (v/v) PEG 2000, 0.15 M potassium sodium tartrate). Crystals appeared
within 5–6 days and grew to a dimension of ~250 µm in length with a hexagonal
shape by day 14. These crystals diffracted to 2.8 Å at the Advanced Photon Source
(APS) synchrotron, Life Sciences Collaborative Access Team (LS-CAT).

Data collection and structure determination. Crystals were transferred to well
solution with 20% (v/v) ethylene glycol as a cryoprotectant before flash freezing in
liquid nitrogen. Data collection was performed at sector 21-ID-D (LS-CAT) of the
APS synchrotron using single native crystals and the diffraction data were processed
with HKL2000 (ref. 23). On the basis of Matthews coefficient calculation,
the crystals have an unusually large unit cell with an estimate of 8–10 molecules per
asymmetric unit. Initial structure determination by molecular replacement using
riboflavin-binding protein (which shares 22% sequence identity with FRα) as a
search model failed to yield any correct solution. To solve the phase problem, a
heavy-atom derivative was prepared by soaking the native FRα crystals with a Pt
salt before data collection. Also, six data sets of native FRα were collected at a
wavelength of 1.77 Å to measure the S anomalous signal to aid in structure
determination. These six data sets were processed using XDS, combined using
Pointless, and merged using Scala of the CCP4 suite as previously described.
Merging multiple data sets increased the S anomalous signal and redundancy of
the data, but also led to an increase of the merging R-factor. Initial phases were
established by using the SHELX program with Pt-soaked derivative data and
native data (Supplementary Table 1). Fifteen Pt atoms were found by SHELXD
with a CC/CCweak score of 31.4/17.1 (CC is the correlation coefficient between Ecalc
and Eobs for all data and CCweak is the correlation coefficient for 30% of reflections
that were not used during the dual-space refinement). Subsequent phasing using
SHELXE generated a contrast score of 0.8 and connectivity of 0.79 for the correct
hand solution. Density modification for the initial electron density map was
performed using DM. A crude model was built automatically using the CCP4
program buccaneer and improved by manual building using Coot. Phases were
further improved by using the S anomalous data and a total of 29 S atoms were
found based on the anomalous difference Fourier using the Phenix program. The
initial FRα model was manually adjusted on the basis of the electron density map
using the riboflavin-binding protein structure as a reference and the improved
model allowed accurate location of eight molecules in one asymmetric unit by
molecular replacement (Supplementary Fig. 9). The models were refined against
the native data with eight-fold non-crystallographic symmetry restraints using the
Refmac program of CCP4 (ref. 22). The densities for folic acid became clear after
several rounds of model adjustments and refinements and eight molecules of folate
were built into the model. The final model was refined to an R factor of 0.206 and
an Rfree factor of 0.256 (Supplementary Table 1). The Ramachandran statistics are
87% in the favoured regions, 12.5% in additional allowed regions and 0.5% in
generously allowed regions.
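The Matthews coefficient estimate mentioned above can be illustrated with a short calculation. This is a sketch only, assuming the standard definitions V_M = V_cell / (Z × n × M) and solvent fraction ≈ 1 − 1.23 / V_M. The unit-cell volume, the number of symmetry operators and the molecular mass used below are hypothetical placeholders, not values from this paper.

```python
def matthews_coefficient(cell_volume_a3, n_asym_units, n_molecules, mw_da):
    """V_M in cubic angstroms per dalton for n_molecules per asymmetric unit."""
    return cell_volume_a3 / (n_asym_units * n_molecules * mw_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction from V_M (Matthews' 1.23 constant)."""
    return 1.0 - 1.23 / v_m


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    CELL_VOLUME_A3 = 3.0e6   # unit-cell volume in cubic angstroms (assumed)
    N_ASYM_UNITS = 6         # asymmetric units per cell (assumed)
    MW_DA = 25_000.0         # approximate protein mass in daltons (assumed)

    for n in range(6, 13):
        v_m = matthews_coefficient(CELL_VOLUME_A3, N_ASYM_UNITS, n, MW_DA)
        print(f"{n:2d} molecules/ASU: V_M = {v_m:.2f}, "
              f"solvent = {100 * solvent_fraction(v_m):.0f}%")
```

Scanning the copy number in this way shows which values give a V_M in the range typical of protein crystals, which is how a bracket such as 8–10 molecules per asymmetric unit is usually obtained before phasing.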

Mutagenesis. Site-directed mutagenesis was carried out using the QuickChange 
method (Stratagene). Mutations and all plasmid constructs were confirmed by 
DNA sequencing. 

Radioligand-binding assay. The binding affinity of each FRα mutant was determined
by saturation radioligand-binding assay. 40 nM of each mutant in 100 µl
binding buffer (25 mM Tris, pH 8.0, 150 mM NaCl, 0.1% Triton X-100) was
immobilized in the wells of a protein G-coated 96-well plate (Thermo Scientific)
for 40 min. Endogenous ligand was stripped with 100 µl stripping buffer (25 mM
acetic acid, pH 3.5, 150 mM NaCl, 0.1% Triton X-100) for 1 min as described
previously. After neutralizing and washing with 200 µl binding buffer, proteins
were incubated for 40 min with 100 µl binding buffer supplemented with the
indicated concentrations of [³H]-folic acid (Moravek Biochemicals). FRα-bound
[³H]-folic acid was determined by scintillation counting following removal of
unbound ligand by two 100-µl washes with binding buffer. Kd was determined
by nonlinear regression using GraphPad Prism.
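For readers unfamiliar with saturation-binding analysis, the sketch below shows one way to estimate Kd from such data. It assumes a one-site model, B = Bmax·[L] / (Kd + [L]), and uses a dependency-free least-squares search in place of GraphPad Prism, which the authors used. The model choice, the function names and the synthetic data are assumptions for illustration, not the published procedure.

```python
import math


def _bmax_and_sse(kd, ligand, bound):
    """For a fixed Kd, Bmax has a closed-form least-squares solution."""
    occupancy = [l / (kd + l) for l in ligand]
    bmax = (sum(o * b for o, b in zip(occupancy, bound))
            / sum(o * o for o in occupancy))
    sse = sum((b - bmax * o) ** 2 for o, b in zip(occupancy, bound))
    return bmax, sse


def fit_one_site(ligand, bound, rounds=6, points=101):
    """Fit B = Bmax*L/(Kd+L) by a coarse-to-fine grid search over log10(Kd).

    Returns (kd, bmax) in the units of the inputs.
    """
    positive = [l for l in ligand if l > 0]
    lo = math.log10(min(positive)) - 3.0
    hi = math.log10(max(positive)) + 3.0
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        grid = [lo + i * step for i in range(points)]
        best = min(grid, key=lambda g: _bmax_and_sse(10.0 ** g, ligand, bound)[1])
        lo, hi = best - step, best + step  # zoom in around the best grid point
    kd = 10.0 ** best
    bmax, _ = _bmax_and_sse(kd, ligand, bound)
    return kd, bmax


if __name__ == "__main__":
    # Synthetic, noise-free data generated with Kd = 0.19 nM (illustration only).
    conc_nm = [0.02, 0.05, 0.1, 0.2, 0.5, 1.0, 2.0, 5.0]
    counts = [1000.0 * c / (0.19 + c) for c in conc_nm]
    kd, bmax = fit_one_site(conc_nm, counts)
    print(f"Kd = {kd:.3f} nM, Bmax = {bmax:.0f}")
```

Searching on a logarithmic Kd axis keeps the fit stable across several orders of magnitude, which matters when mutants range from near wild-type affinity to more than tenfold weaker.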
CORRECTIONS & AMENDMENTS


CORRIGENDUM 
doi:10.1038/nature12369 


Corrigendum: Replication stress
links structural and numerical
cancer chromosomal instability


Rebecca A. Burrell, Sarah E. McClelland, David Endesfelder,
Petra Groth, Marie-Christine Weller, Nadeem Shaikh,
Enric Domingo, Nnennaya Kanu, Sally M. Dewhurst,
Eva Gronroos, Su Kit Chew, Andrew J. Rowan, Arne Schenk,
Michal Sheffer, Michael Howell, Maik Kschischo, Axel Behrens,
Thomas Helleday, Jiri Bartek, Ian P. Tomlinson
& Charles Swanton


Nature 494, 492-496 (2013); doi:10.1038/nature11935 


In this Letter we inadvertently omitted full details of The Cancer 
Genome Atlas data sets. The Acknowledgements should have included 
these sentences: “The results published here are in part based upon data 
generated by The Cancer Genome Atlas pilot project established by the 
NCI and NHGRI. Information about The Cancer Genome Atlas and 
the investigators and institutions that constitute The Cancer Genome 
Atlas Research Network can be found at http://cancergenome.nih.gov/. 
The data were retrieved through dbGaP authorization (accession num- 
bers phs000178.v4.p4 and phs000178.v5.p5).”. 


RETRACTION 
doi:10.1038/nature12383 


Retraction: Oligosaccharide ligands 
for NKR-P1 protein activate NK 
cells and cytotoxicity 


Karel Bezouska, Chun-Ting Yuen, Jacqui O’Brien, 
Robert A. Childs, Wengang Chai, Alexander M. Lawson, 
Karel Drbal, Anna Fiserova, Miloslav Pospisil & Ten Feizi 


Nature 372, 150-157 (1994); doi:10.1038/372150a0 and correction 
Nature 380, 559 (1996); doi:10.1038/380559a0 


We wish to retract this Article owing to an inability to reproduce the 
results. This retraction has not been signed by K.B. and A.F., and M.P. 
is deceased (J.O’B. cannot be traced). 
ERRATUM
doi:10.1038/nature12384 


Erratum: The importance of 
feldspar for ice nucleation by 
mineral dust in mixed-phase 
clouds 


James D. Atkinson, Benjamin J. Murray,
Matthew T. Woodhouse, Thomas F. Whale, Kelly J. Baustian,
Kenneth S. Carslaw, Steven Dobbie, Daniel O’Sullivan
& Tamsin L. Malkin


Nature 498, 355-358 (2013); doi:10.1038/nature12278 


In this Letter, the affiliation for Matthew T. Woodhouse was incor- 
rectly listed as CSIRO, Australia (affiliation number 2), whereas this 
should have been set as his ‘present’ address. The address for M.T.W. 
while the work in this Letter was being carried out was: Institute for 
Climate and Atmospheric Science, School of Earth and Environment, 
University of Leeds, Leeds, LS2 9JT, UK (affiliation number 1). This 
has been corrected in the HTML and PDF versions of the manuscript. 
Altmetrics make their mark


Alternative measures can yield useful data on achievement — but must be used cautiously. 


BY ROBERTA KWOK 


Steve Pettifer and his colleagues did not
heavily promote their 2008 paper on digital
library tools. So it came as a surprise
when, in August 2012, Pettifer got an e-mail
from the Public Library of Science (PLOS),
based in San Francisco, California. A PLOS
representative told him that people had viewed
or downloaded the article (D. Hull et al. PLoS
Comput. Biol. 4, e1000204; 2008) more than
53,000 times. It was the most-accessed review
ever to be published in any of the seven PLOS
journals. The paper had come out just as biologists’
interest in digital publishing was building
and the number of tools was exploding, says
Pettifer, a computer scientist at the University
of Manchester, UK. “It hit the right note at the
right time,” he says.

At one time, Pettifer would have listed the
paper on his CV accompanied by the journal's
impact factor and the article’s number of cita- 
tions — in this case, about 80. But when he 
came up for promotion this year, he realized 
that tracking citations was not going to tell 
the whole story about the paper’s influence. 
Impact factor is a crude measure that applies 
only to the journal, not to specific articles, he 
says; citations take a long time to accumulate, 
and people may not cite a paper even if it influ- 
ences their thinking. So he added the number 
of views to the CV entry. And he did not stop 
there. 

Next to many of the papers listed, Pettifer 
added labels indicating scholarly and public 
engagement. The labels were generated by 
ImpactStory in Carrboro, North Carolina, one 
of several services that gauges research impact 
using a combination of metrics — in this case, 
a wide range of data sources, including the 


number of times a paper has been shared on 
social-media websites or saved using online 
research tools. 

When Pettifer submitted his annotated 
CV for the first round of promotion review, 
his mentor expressed confusion. “He took a
look and said, ‘What the hell are these badges
doing in your CV?’” recalls Pettifer. “But once
I explained them, he said, ‘Well, give it a go.’”
Pettifer submitted his CV for the second 
round — and got his promotion. He does not 
know for sure whether the metrics helped, but 
he plans to use them on future grant applica- 
tions. “I’m definitely a convert,” he says. 


OUTSIDE THE BOX 

‘Altmetrics’, a term coined in 2010 by ImpactStory
co-founder Jason Priem, refers to a range
of measures of research impact that go beyond 
citations. Several altmetrics services have 
emerged in the past few years (see ‘Four
ways to score’). They produce reports that 
gauge impact by taking into account not just 
academic citations, but also digital use and 
sharing of data — which can include the num- 
ber of times a paper has been tweeted, ‘liked’ 
on Facebook, covered by the media or blogs, 
downloaded, cited on Wikipedia or book- 
marked online. Some services also evaluate 
research products such as software, data sets 
and slideshows by tracking the number of 
people who have used or viewed the product 
online (see Nature 500, 243-245; 2013). 
Altmetrics offer researchers a way to showcase
the impact of papers that have not yet gathered
many citations, and to demonstrate engagement
with the public. They can be accessed through
journals or independent websites, and can track
the impact of particular data sets or papers, or
evaluate the combined influence of publications
and products produced by multiple
researchers in a department.

But these services must be used wisely. They 
are not meant for strict quantitative compari- 
sons; nor do they always distinguish between 
positive and negative attention. And although 
scientists can include altmetrics in job and grant 
applications and annual reports, they must 
select relevant data and clearly explain the con- 
text to avoid provoking mistrust or confusion. 

Some altmetrics services generate profiles 
that summarize the impact of a researcher’s 
products. ImpactStory allows scientists to 
import lists of items such as papers and soft- 
ware from existing user profiles at websites 
such as Google Scholar, which automatically 
tracks a researcher’s papers, or the online soft- 
ware-code repository GitHub. Scientists can 
also manually enter the digital object identifi- 
ers (DOIs) of their papers, or input their Open 
Researcher and Contributor ID (ORCID), a 
unique identifier that can be used to tag all of 


a researcher’s work. ImpactStory then creates 
a profile showing how frequently each product 
has been viewed, saved, discussed, cited or rec- 
ommended online. 

Other services take a more article-centric 
approach. Altmetric in London allows users to 
access data on individual papers using a book- 
marklet — a browser bookmark that executes 
JavaScript commands. (Altmetric is funded 
partly by Digital Science, a sister company to 
Nature Publishing Group.) Users install the 
bookmarklet in their Internet browsers; then, 
when they come across a paper that they are 
interested in, they click the bookmarklet button. 
A report pops up in the corner of the browser, 
providing altmetrics that include a score indi- 
cating how much online attention the paper has 
received. The score takes into account the num- 
ber of people who have read or mentioned the 
article, as well as the relative importance of the 
medium and the mentioner. Newspaper cover- 
age is weighted more heavily than tweets, and 
tweets by individuals more heavily than those 
by journals promoting their content. 
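The weighting idea described above can be sketched in a few lines of Python. This is an illustration of the principle only: the actual Altmetric formula is not given in the article, so every weight, name and category below is a hypothetical stand-in. The sketch preserves just the ordering the text describes, with news coverage above tweets and tweets by individuals above tweets by journals.

```python
# Hypothetical weights: chosen only so that news > individual tweet > journal tweet.
MEDIUM_WEIGHT = {"news": 8.0, "tweet": 1.0}
JOURNAL_PROMO_FACTOR = 0.5  # hypothetical discount for a journal promoting its own content


def mention_weight(medium, from_journal=False):
    """Weight of one mention, by medium and by who made it."""
    weight = MEDIUM_WEIGHT[medium]
    if medium == "tweet" and from_journal:
        weight *= JOURNAL_PROMO_FACTOR
    return weight


def attention_score(mentions):
    """Sum the weights of (medium, from_journal) mentions for one paper."""
    return sum(mention_weight(medium, from_journal)
               for medium, from_journal in mentions)


if __name__ == "__main__":
    mentions = [("news", False), ("tweet", False), ("tweet", False), ("tweet", True)]
    print(attention_score(mentions))
```

Because a score of this kind collapses different kinds of attention into one number, it works best as a pointer to the underlying mentions and not as a ranking in itself.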

Many journals display some altmetrics on 
their sites automatically; these might be gen- 
erated in-house or provided by an external 
service. Every article published by PLOS, for 
example, includes an online metrics tab show- 
ing data such as views, downloads and social- 
media mentions. A feature called Article-Level 
Metrics Reports lets users search for PLOS 
papers by criteria such as author or keyword, 
and generates a summary metrics report for the 
set of results, including article usage by paper 
age and maps of authors’ locations. Several 
journal publishers, including Nature Publishing 
Group in London and Cell Press in Cambridge, 
Massachusetts, display data from Altmetric on 
their sites, and John Wiley & Sons in Hoboken, 
New Jersey, began a trial with the metrics firm in 
May. HighWire Press, an electronic-publishing 
platform at Stanford University in Palo Alto, 
California, is collaborating with ImpactStory 
to add altmetrics to its journal websites. 

Altmetrics enable scientists to see ripples 
generated by their research that might other- 
wise go unnoticed. Individual researchers 
can try to track buzz on their own, but data- 
aggregation and updating services make it 
much easier. These services also automate 
difficult tasks, such as finding all tweets that 


link to a particular paper; each article will have 
multiple URLs, so conducting such a search 
manually would be very time-consuming. 

The reports can even suggest potential 
collaborators or journals. For example, if an 
informatics paper is mentioned a lot by biolo- 
gists, the author might consider publishing 
his or her next article in a biology journal to 
increase exposure, says Heather Piwowar, 
co-founder of ImpactStory. 


MEASURES OF CAUTION 

Despite the benefits, researchers and evalua- 
tors must interpret altmetrics data cautiously. 
Data sets might not be comprehensive: not all 
services detect news stories that do not give 
URLs for the study, for example. The popularity
of social-media sites changes over time, so
it is unrealistic to expect a paper published in 
2008 to generate as many tweets as one pub- 
lished in 2013. And some disciplines, such 
as computational biology, are more active 
than others on social media, so comparisons 
between disciplines may be unfair. 

To get the most meaningful information, 
users should dig into the underlying data. 
Although a paper’s Altmetric score can sug- 
gest whether it is worth clicking through to the 
more detailed report, “qualitative assessment 
is far more important than the number”, says
Euan Adie, founder of Altmetric. 

To help users to interpret the data, most 
services put numbers in context. Impact- 
Story normalizes data by publication year and 
includes percentiles — it might, for example, 
note that a given paper has more readers on 
the online reference manager Mendeley than 
97% of papers indexed that year. Altmetric 
shows results normalized by journal, which 
allows fairer comparison of papers in disci- 
pline-specific publications. And in May, PLOS 
began offering Relative Metrics, a service that 
lets users see how a paper compares to other 
PLOS articles in the same subject area, using 
tools such as graphs of article views. 
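Normalizing by publication year, as described above, amounts to ranking each paper only against papers from the same year. The sketch below shows one way to compute such a percentile. It is not ImpactStory's implementation; the function names, the strictly-below definition of percentile and the sample reader counts are assumptions for illustration.

```python
from collections import defaultdict


def percentile_rank(value, cohort):
    """Percentage of cohort values that fall strictly below value."""
    below = sum(1 for v in cohort if v < value)
    return 100.0 * below / len(cohort)


def percentiles_by_year(papers):
    """Rank each paper's reader count against papers from the same year.

    papers: list of (paper_id, year, readers) tuples.
    Returns {paper_id: percentile within its publication year}.
    """
    by_year = defaultdict(list)
    for _, year, readers in papers:
        by_year[year].append(readers)
    return {paper_id: percentile_rank(readers, by_year[year])
            for paper_id, year, readers in papers}


if __name__ == "__main__":
    # Hypothetical reader counts, for illustration only.
    sample = [("a", 2008, 10), ("b", 2008, 20), ("c", 2013, 500), ("d", 2013, 1000)]
    print(percentiles_by_year(sample))
```

In the sample, papers "b" and "d" receive the same percentile even though their raw counts differ fiftyfold, which is the point of comparing within a cohort.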

Including altmetrics in decisions on grants, 
hiring and tenure requires careful considera- 
tion. Gerald Rubin, executive director of the 
Howard Hughes Medical Institute's Janelia 
Farm Research Campus in Ashburn, Virginia, 
is sceptical of altmetrics that do not explicitly 
indicate quality, such as number of tweets. He 


FOUR WAYS TO SCORE
A quartet of services offers free metrics reports that go beyond citations.

| | ImpactStory | Altmetric | PLOS Article-Level Metrics | Plum Analytics |
| Products tracked | Papers, software, data sets and more | Papers, data sets, some books | Papers published by the Public Library of Science (PLOS) | Papers, books, patents and more |
| What you get | Profile page, metrics badges, application programming interface (API; a means for software to access the altmetrics) | Bookmarklet, metrics badges, API | Summary reports, WordPress widget, API | Profile page (currently in testing), API |
| Publishers | Various, including eLife, Pensoft Publishers, PeerJ | Include Nature Publishing Group, Cell Press, BioMed Central | PLOS | Medwave (forthcoming this year) |
| Major funders | Alfred P. Sloan Foundation | Digital Science | PLOS, Alfred P. Sloan Foundation | Self-funded |
adds that altmetrics suffer from one of the
same flaws as citation counts: a mediocre 
paper in a popular field will receive more 
attention than a first-rate paper in a small 
field. And including altmetrics in a job 
application? “At this point, I don't think
anyone would pay attention,” says Rubin,
who looks at many applications. 

But some people do pay attention. 
Scientists are permitted to use altmetrics to 
demonstrate social impact in reports for the 
Research Excellence Framework (REF), an 
evaluation of UK academia that influences 
funding, notes Graeme Rosenberg, REF 
manager at the Higher Education Funding 
Council for England in Bristol. Plum Analyt- 
ics, an altmetrics company based in Dresher, 
Pennsylvania, and Seattle, Washington, this 
year completed a pilot project with the Uni- 
versity of Pittsburgh in Pennsylvania, in 
which it generated altmetrics profiles for a 
subset of researchers that could be aggre- 
gated by department. The next step is to roll 
out altmetrics profiles for the entire insti- 
tution, says company co-founder Andrea 
Michalek. Plum is also currently running 
projects with about ten other institutions. 

Rubin is better disposed towards altmet- 
rics that suggest a positive value judgement, 
such as the number of requests to use soft- 
ware. In that vein, Adie suggests that rather 
than simply reporting numbers, researchers 
should use altmetrics to find success stories 
that they can mention in their CVs or on 
their websites. The data might reveal that a 
non-governmental organization or a government
department took notice of a paper,
for example. Altmetric plans soon to start 
flagging up citations by agencies such as the 
World Health Organization and the Inter- 
governmental Panel on Climate Change, 
both based in Geneva, Switzerland. 

Context such as percentile ranks or 
explanations of data sources can help evalu- 
ators to interpret altmetrics. In Pettifer’s CV, 
he included a legend for his ImpactStory 
labels, listing some of the data sources, 
such as Mendeley, Twitter and Wikipedia. 
Piwowar suggests that researchers who 
worry that evaluators will view altmetrics 
negatively could start by including the data 
in annual performance reviews, which are 
lower-risk than grant or job applications. 

Some think that altmetrics will soon 
become a normal part of a CV. It used to 
be that researchers who wanted to demonstrate 
the importance of a recently published 
article could only say, “Look, I really 
believe this is great research,” notes Mike 
Thelwall, an information scientist at the 
University of Wolverhampton, UK. Now, 
he adds, “you can back up your words with 
a little evidence.” 


Roberta Kwok is a freelance science writer 
in Seattle, Washington. 


TURNING POINT 


CAREERS 


Jason Weber 


Breast-cancer researcher Jason Weber of 
Washington University in St. Louis, Missouri, 
is struggling to maintain funding. As a mid- 
career researcher, he is part of the demographic 
in greatest jeopardy in the wake of US research- 
funding cuts (see Nature 498, 527-538; 2013). 
In May, he wrote an opinion piece about his 
plight in the St. Louis Post-Dispatch, which 
caught the attention of a US Senator. 


How did you end up studying breast cancer? 
As a postdoc at St. Jude Children’s Research 
Hospital in Memphis, Tennessee, I worked at 
the cutting edge of cell-cycle regulation, and 
my team discovered a key tumour suppressor. 
In 2001, I was hired to work in the then-new 
molecular-oncology division at Washington 
University in St. Louis, where researchers were 
mixing genomics with cancer biology and 
making the translational jump to the clinic. 
Breast cancer was an area where we could 
make a big impact clinically. 


Did it take you long to get your footing in that 
competitive field? 

It took a couple of years. The big break came 
in 2002, when I was named a Pew Scholar. The 
Pew Charitable Trusts, headquartered in Wash- 
ington DC, provide generous funding and con- 
vene scholars to collaborate and exchange ideas 
at an annual meeting. So I was interacting with 
a diverse group of Pew scholars, which helped 
me and my lab members to think outside the 
box and explore new techniques. We started 
going in many different directions — which led 
to an influx of money between 2007 and 2008. 


In what ways does your lab’s situation now 
differ from what it was five years ago? 

Back then, we had more than US$1.1 million 
in project funding from various sources: Susan 
G. Komen for the Cure, the American Cancer 
Society, two R01 grants from the US National 
Institutes of Health (NIH), and a Department 
of Defense Era of Hope grant. I had 17 people 
in the lab. But my NIH funding recently ran 
out and did not get renewed. I currently have 
a $100,000 grant from a children’s foundation, 
and four people in the lab. 


How has the US government’s budget 
sequestration directly affected your lab? 

The sequester adds to the burden in terms of 
what gets funded in the grant-review process. 
Essentially, an R01 grant application to the US 
National Cancer Institute has to be in the top 
6-8% to get funded. Yet there is little difference 
between a grant scoring in the top 5% and one in 
the top 15% — it becomes arbitrary. My greatest 
fear is that by trimming the fat, we're starting to 
hit muscle. Labs with 10 to 15 people who are 
doing solid work are getting the squeeze now. 


Why did you write your opinion piece on the 
impact of funding cuts? 

I just got fed up. None of my non-science 
friends had any idea how bad the cuts were. I 
wrote it after I laid off one of my best young 
scientists, and two of my PhD students switched 
career paths after they graduated because of 
concerns about funding. I didn't write a ‘woe is 
me’ piece; I wrote a ‘the public needs to better 
understand how these cuts actually affect the 
economy’ piece. It led to conversations with 
Senator Dick Durbin (Democrat, Illinois). 
His staff called me to discuss the impacts of 
the sequester and the economic downturn on 
science funding. I got the sense that he is on 
our side at a time when it is difficult to find a 
congressional representative who is carrying 
the banner of scientific research in this country. 


What is your outlook like now? 

Bleak. It is frustrating to be stuck in front of the 
computer writing grants, instead of in the lab 
doing and guiding experiments. I have seven 
grant applications out right now, and I am 
writing three more. 


What is most frustrating to you? 

Every politician says that to have a great econ- 
omy, we need a well-educated workforce. Yet 
although the government has the ability to 
maintain the highest level of that educated 
workforce, it chooses to slash science fund- 
ing through the sequester. It makes no sense 
to train people with PhDs and then not fund 
them. Scientists need to speak up. 


BY VIRGINIA GEWIN 


22 AUGUST 2013 | VOL 500 | NATURE | 493 


© 2013 Macmillan Publishers Limited. All rights reserved 


SCIENCE FICTION 


BY MARKO JANKOVIC 


The sunset was surreal. Tom 
sat on the beach gazing at 
the dying star — a copper 
beacon falling slowly towards the 
murky depths beneath. Every 
now and then, the temperate 
ocean waters would nibble at his 
feet before shying away in a flurry 
of pearly white foam. His hands 
touched the warm sand beneath 
him; he could feel the tiny grains 
pressing against his palms. A 
breeze swept along the coastline, 
as if the ocean was drawing deep, 
uneasy breaths, waiting for some- 
thing to happen. 

Tom felt alone. 

So this is what it’s like to be the 
last man on Earth, he thought. 

He sat next to Tom, his eyes 
fixed on the crimson horizon. 
His skin was fair, his hair long 
and blue. To Tom, his proud countenance 
displayed the features of a champion from 
songs long forgotten. Yet for all the striking 
beauty of this ‘man’ in his late twenties, there 
was something disturbing about his face. 
His eyes, blue as the morning sky, perfectly 
reflected the sunset. 

He turned to Tom and smiled. 

“Well now. Do you have my answer?” 

His voice was tranquil, with tones that 
seemed as though they were woven from 
finest silk. Tom squinted as the Sun’s dying 
gold infused the clear waters, and rubbed 
his eyes. He, on the other hand, kept his eyes 
wide open, impervious to the stinging sunlight. 
Tom took a short, shallow breath and 
licked his lips. 

“I think I do.” 

“So, Tom. What do you think makes a 
human feel — human?” 

The sound of waves gently breaking on the 
sands filled the silence. Although his voice 
was soothing to the point of sleep, a careful 
listener could discern a more sinister note. 
Tom knew that this mild-mannered person 
had killed everyone on Earth, leaving only 
him, Tom Anderson, alive. 

Not long had passed since the War of the 
Machines. Tom was witness to the ruthless 
purges the robots so 
ALONE 


A universal feeling. 


overconfident faces of the human leaders as 
they proclaimed it would be a swift victory 
over the “insolent automata”. Three wounds, 
seven months and twelve front lines later, 
those faces existed solely upon billboards, 
posters and half-burnt pieces of newspaper 
covering the ruins of cities worldwide. 

After a breath, Tom was out of his memo- 
ries and back on the warm shore again. The 
machine was still looking directly at him, 
waiting for an answer. 

“Others.” 

“Others, Tom? I am not sure I follow you.” 

“What I think makes us feel human. It is 
others. We are incomplete — others com- 
plement us and make us whole. They are 
like mirrors scratched by solitude in which 
we try to catch a glimpse of ourselves — 
together, we erase each other's imperfec- 
tions. A blinded man grasps in the dark not 
for his eyes, but for a helping hand. A pris- 
oner would trade his ration gladly for a few 
words with the guard. To be truly alone — 
that is a wish no human would make.” 

Tom finished the speech in one breath. It 
was a relief and a burden at the same time, to 
comprehend that great a truth. He felt empty, 
forsaken and defeated. He felt — alone. 

He smiled. Tom saw a discreet twitching 
of the replicant’s eyebrow. He stood up and 
brushed the sand from his clothes, not once 
lifting his gaze from Tom's face. 

“Thank you, Tom. That is all we wanted 
to know.” 
He sat at his fake-oak desk in the 
office overlooking the beach. His 
name was Martin Gardner. He 
was 28 and chief executive of the 
Artificial Intelligence Consor- 
tium — the most powerful indus- 
trial entity in the Solar System. 
The board members sat in front 
of him, eagerly awaiting the infor- 
mation that might be their ticket 
to almost endless wealth. 

“So, how did it go?” 

“He fell for it. Hook, line and 
sinker.” 

The fair-skinned, blue-haired 
man reached for a cigar, lit it up 
and puffed a large, satisfying 
cloud of Cuba. 

“All of the programming has 
been auto-rewritten. He is now 
entirely convinced that he is 
human.” 

“Auto-rewritten?” 

“Yes... That puzzles me a bit. 
He made himself think he is human — we 
didn't add a single line of code to the soft- 
ware.” 

“How is that even possible?” 

“Don't know. Don't really care.” 

An uneasy note of doubt shook his usually 
immaculate diction — he covered it quickly 
with the briskness of his voice. 

“All that matters is that we are now ready 
for mass production. Not knowing who they 
work for, why or, most importantly, what 
they are, these robots will be perfect spies. 
Contact the army. I want a contract on my 
desk by 4 o'clock.” 

A satisfied cheer filled the office. The 
board members rose to congratulate one 
another, already slavering over the future 
state of their bank accounts. Martin watched 
their euphoria through a thick, silvery mist 
of tobacco. He looked over his shoulder at 
the window. Far behind him, on a darken- 
ing beach, quiet and still, sat a machine. He 
thought of its words. 

Martin turned his blue eyes back to the 
celebration. Here, amid the jubilation and 
laughter, surrounded by his colleagues, he 
drank in the smug display of naked avarice. 

And he felt alone. 


Marko Jankovic is an intern medical doctor 
at a hospital for infectious and tropical 
diseases. He sometimes puts down the 
stethoscope and takes up the pen in the name 
of science fiction. 
“Disentangling nestedness” disentangled 


ARISING FROM A. James, J. W. Pitchford & M. J. Plank Nature 487, 227-230 (2012) 


Analytical research indicates that the ‘nestedness’ of mutualistic net- 
works facilitates the coexistence of species by minimizing the costs of 
competition relative to the benefits of facilitation. In contrast, James 
et al. (ref. 2) recently argued that a more parsimonious explanation exists: 
the persistence of a community and its constituent species depends 
more on their having many interactions (high connectance and high 
degree, respectively) than for these interactions to be organized in any 
particular manner. Here we demonstrate that these conclusions are an 
unintended consequence of the fact that the methodology of ref. 2 
directly changed the number of interactions of each species—and hence 
their expected persistence. When these changes are taken into account, 
we find a significant, positive relationship between nestedness and 
network persistence that reconfirms the importance of nestedness in 
mutualistic communities. There is a Reply to this Brief Communication 
Arising by James, A., Pitchford, J. W. & Plank, M. J. Nature 500, 
http://dx.doi.org/10.1038/nature12381 (2013). 

Given a network, one can robustly quantify the relative numbers of 
specialist to generalist species via the degree distribution. A network’s 
degree distribution is of considerable importance, because studies have 
repeatedly highlighted the significant, positive relationship between a 
species’ number of mutualistic partners and its survival probability. 
This distribution alone is also capable of driving many higher-order 
network properties, not to mention the fact that the degrees of species 
are phylogenetically constrained themselves. For these and other reasons, 
studies across the ecological-network literature have emphasized 
the need to take the degree distribution into consideration when 
assessing the significance of the myriad patterns observed in nature. 

Unfortunately, when comparing empirically observed networks to 
random networks, the authors of ref. 2 seem to have overlooked this 
critical link between changes in the degree distribution and species’ 
survival. As a direct consequence, the specialists in their random networks 
became less specialist and the generalists less generalist. Yes, 
the random networks were observed to be more persistent (Fig. 1a), 
but this was not in fact an indication that nestedness is unimportant. 
Instead, this increase in persistence arose because the random networks 
had more homogeneous degree distributions, and because 
the most vulnerable species in the empirical networks almost always 
had more interactions in the corresponding randomizations. Here 
this distinction is of critical importance because species’ degrees are, 
in fact, “a better predictor of individual species survival”. “The more 
the merrier” indeed. 

To quantitatively validate these results, we repeated a key analysis 
of ref. 2 to measure the relationship between nestedness and persist- 
ence while paying explicit attention to changes in the network’s degree 
distribution (Methods). On taking the small but critical step of con- 
trolling for the increased homogeneity of the degree distributions, we 
observe a significant, positive relationship between nestedness and 
persistence (Fig. 1b). In addition, we reach the same conclusion whether 
we account for changes in the degree distribution statistically or by 
repeating the analysis while generating the randomized networks with 
a null model that explicitly maintains the observed degree distribution 
(Fig. 1c, Methods and Appendix). All else being equal, our results here 
illustrate that, the greater the nestedness of a community, the greater 
indeed is that community’s persistence. 

Given an observed number of species and interactions in a com- 
munity, a prevailing question across the ecological literature is whether 
or not some ways to structure those interactions (for example, nested- 
ness) lead to more persistent communities. Although the number of 


[Figure 1 panels: a, Uncontrolled for; b, Controlled for statistically; c, Controlled in null model. Axes: Nestedness; Partial residuals on logit(persistence).] 


Figure 1 | Within our regression analysis, the relationship between 
nestedness and persistence in mutualistic networks depends integrally on 
changes in the degree distributions of the networks. a, If these distributions 
are allowed to change but are uncontrolled for, nestedness appears to be 
negatively correlated to persistence (P< 10“). b, c, However, when these 
changes are appropriately controlled for—either statistically (b) or in the null 
model for randomization (c)—there is a significant positive relationship 
between nestedness and persistence (P < 10 * and P < 10 “4, respectively). The 
same general conclusions reached here for the probabilistic null model hold for 
other, non-degree-preserving randomizations. 


mutualistic interactions of a species plays an important role in its 
survival, we find unambiguous support for the added importance 
of the way in which mutualistic interactions are organized: the true 
architecture of biodiversity. Echoing ref. 2, our findings re-emphasize 
the importance of carefully considering the interplay between all 
potential sources of variation in ecological models. Otherwise, one 
runs the risk of further entangling models that are sufficiently tangled 
already. 


Methods 


For 59 empirical networks, we generated 250 randomized networks and for each we 
simulated persistence (the fraction P of surviving species in each simulation) across 
250 parameterizations of a dynamic mutualistic model. We quantified the relationship 
between persistence and nestedness with a mixed-effects logistic regression 
that takes the form $\mathrm{logit}(P_{ijk}) = \beta_0 + \beta_1 M_i + \beta_2 C_i + \beta_3 W_{ij} + \beta_4 N_{ij} + n_i + r_{ij} + \epsilon_{ijk}$. 
Here the indices $i$, $j$ and $k$ indicate the empirical network, network randomization 
and model parameterization, respectively, $\beta_0$ is a constant, the slopes $\beta_1$, $\beta_2$, $\beta_3$ and 
$\beta_4$ quantify the importance of network magnitude $M$, connectance $C$, relative 
degree homogeneity $W$, and nestedness $N$, respectively, the random effects $n_i$ 
and $r_{ij}$ control for variance across networks and randomizations, and $\epsilon_{ijk}$ is the 
model residual. Variance inflation factors gave no indication of multicollinearity in 
this model. 
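
To make the structure of this regression concrete, the following minimal Python sketch evaluates the linear predictor (the constant, the four slopes and the two random intercepts) and maps it back to a persistence between 0 and 1. The function names and every coefficient value are hypothetical and chosen only for illustration; they are not fitted estimates, and the model fitting itself (which the Appendix says was done with glmer in R) is not reproduced here.

```python
import math


def logit(p):
    """Log-odds of a proportion p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def expected_persistence(beta, M, C, W, N, n_i=0.0, r_ij=0.0):
    """Persistence implied by the linear predictor, ignoring the residual.

    beta is (b0, b1, b2, b3, b4): a constant plus slopes for magnitude M,
    connectance C, relative degree homogeneity W and nestedness N.
    n_i and r_ij are the random intercepts for network and randomization.
    """
    b0, b1, b2, b3, b4 = beta
    eta = b0 + b1 * M + b2 * C + b3 * W + b4 * N + n_i + r_ij
    return inv_logit(eta)


# Hypothetical coefficients (NOT fitted values) with a positive nestedness slope.
beta = (-1.0, 0.01, 2.0, 1.5, 0.8)
low = expected_persistence(beta, M=50, C=0.2, W=0.0, N=0.2)
high = expected_persistence(beta, M=50, C=0.2, W=0.0, N=0.6)
print(low, high)  # with all else held equal, the more nested case is higher
```

Holding M, C and W fixed while varying N is the "all else being equal" comparison made in the text; a positive slope on N then translates directly into higher expected persistence.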
Appendix 

We randomized the empirical networks with two null models: the probabilistic and 
fixed (or swap) algorithms. For our purposes here, the key distinction between the 
two is that the probabilistic model generates random networks with quantitatively 
more homogeneous degree distributions than those observed empirically ($W_{ij} > 0$) 
whereas the degree distribution is strictly conserved in networks generated by the 
fixed model ($W_{ij} = 0$). The statistical analyses presented here were performed in R 
version 2.15.3 (http://R-project.org/) using the glmer function in package lme4 
version 0.999999-0 (http://lme4.r-forge.R-project.org). Code to perform the network 
randomizations and dynamic simulations in Matlab (http://www.matlab.com/) 
and the mixed-effects logistic regressions in R (http://R-project.org/) is available 
from the Dryad Digital Repository at http://dx.doi.org/10.5061/dryad.p2gq8. 
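
The difference between the two null models can be illustrated with a short Python sketch. It is not the authors' Matlab code: the function names are hypothetical, the swap routine is a generic "checkerboard swap", and the probabilistic routine assumes one common formulation in which the probability of an interaction is the average of the row and column fill fractions. The point is only that the first conserves every species' degree exactly, whereas the second conserves degrees only in expectation and so lets them drift.

```python
import random


def degrees(matrix):
    """Row and column sums of a binary interaction matrix."""
    row_deg = [sum(row) for row in matrix]
    col_deg = [sum(col) for col in zip(*matrix)]
    return row_deg, col_deg


def swap_randomize(matrix, n_swaps, rng):
    """Fixed (swap) null model: repeatedly flip 2x2 checkerboards.

    Each accepted flip moves two interactions but leaves every row and
    column sum, and hence the degree distribution, unchanged.
    """
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(n_swaps):
        r1, r2 = rng.sample(range(n_rows), 2)
        c1, c2 = rng.sample(range(n_cols), 2)
        is_checkerboard = (
            m[r1][c1] == m[r2][c2]
            and m[r1][c2] == m[r2][c1]
            and m[r1][c1] != m[r1][c2]
        )
        if is_checkerboard:
            m[r1][c1], m[r1][c2] = m[r1][c2], m[r1][c1]
            m[r2][c1], m[r2][c2] = m[r2][c2], m[r2][c1]
    return m


def probabilistic_randomize(matrix, rng):
    """Probabilistic null model (assumed form): cell (i, j) is filled with
    probability equal to the mean of row i's and column j's fill fractions."""
    row_deg, col_deg = degrees(matrix)
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [
        [
            1 if rng.random() < 0.5 * (row_deg[i] / n_cols + col_deg[j] / n_rows) else 0
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]


# Small illustrative matrix (rows and columns are the two guilds of species).
observed = [[1, 1, 0, 0], [0, 1, 1, 0], [1, 0, 0, 1]]
swapped = swap_randomize(observed, 200, random.Random(1))
resampled = probabilistic_randomize(observed, random.Random(1))
print(degrees(observed), degrees(swapped), degrees(resampled))
```

Because the probabilistic model redraws every cell, a specialist with one partner can easily end up with two or three, which is the homogenization of degrees that the regression term W is meant to capture.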


REPLYING TO S. Saavedra & D. B. Stouffer Nature 500, http://dx.doi.org/10.1038/nature12380 (2013) 


Saavedra and Stouffer (ref. 1) claim that the results of James et al. (ref. 2) are a 
consequence of the method used to randomize interaction matrices. 
We recognize the importance of examining alternative randomization 
schemes and have repeated our analysis using their methods. 
However, we find no evidence that ‘reconfirms the importance of 
nestedness in mutualistic communities’. 

Repeating the analysis of figure 2 in ref. 2, using the swap randomization 
scheme, which does not change degree distribution, confirms 
our finding that the persistence of real networks is not related to 
their nestedness. Although more of the empirical networks are less 
nested than their randomized counterparts under this scheme, con- 
trary to the accepted result of ref. 4, there is no useful correlation 
between nestedness and persistence (Fig. 1). Therefore, the ‘small 
but critical step’ of accounting for degree heterogeneity does not 
produce a positive relationship between nestedness and persistence. 

The results in figure 1 of ref. 1 represent relationships between 
nestedness and persistence among randomizations of individual net- 
works. They do not imply that, given two observed networks, the 
more nested network is more likely to have the higher persistence 
as claimed in ref. 5. We have performed the general linear mixed 
model (GLMM) analysis advocated in ref. 1. This shows that >90% 
of the variance comes from variance between groups (networks) and 
<10% comes from variance within groups. This highlights the lack of 
consistency across the groups, and that any effect of nestedness is 
dwarfed by the random effects of the GLMM. 

The NODF definition of nestedness used in refs 1 and 2 is one of 
several possible metrics. For example, the nestedness metric used in 
Figure 1 | Accounting for degree distribution does not give a meaningful 
relationship between nestedness and persistence. Repeating the 
results of figure 2 in ref. 2 using the swap randomization scheme, 
which does not change connectance or degree distribution, there is no 
useful correlation between the change in nestedness (relative to the 
empirical network) and the change in the persistence of the dynamic 
model. Each point represents the average of 100 randomizations of an 
empirical network. 


BRIEF COMMUNICATIONS ARISING 


ref. 5 is invariant under the swap randomization scheme of ref. 1, 
precluding a GLMM approach. Under the metric of ref. 7, the 
GLMM reveals a negative relationship between nestedness and per- 
sistence. That the conclusions of ref. 1 are sensitive to the choice of 
metric indicates that they cannot be used to draw general conclusions 
about the effects of nestedness. In contrast, the results in ref. 2 are 
robust to the choice of metric. 
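
For readers unfamiliar with NODF (nestedness based on overlap and decreasing fill), the sketch below shows one way the metric is commonly computed for a binary interaction matrix. It is an illustrative Python version with hypothetical function names, not the code used in either paper, and it returns a value between 0 and 1 rather than the 0 to 100 scale on which NODF is often reported. For every pair of rows (and every pair of columns) with different degrees, it records the fraction of the less-connected species' partners that are shared with the more-connected one; pairs with equal degrees contribute zero.

```python
from itertools import combinations


def _paired_overlap(vectors):
    """Sum of paired-overlap scores and number of pairs for rows or columns."""
    total = 0.0
    n_pairs = 0
    for a, b in combinations(vectors, 2):
        n_pairs += 1
        deg_a, deg_b = sum(a), sum(b)
        # Decreasing-fill rule: equal degrees (or an empty vector) score zero.
        if deg_a == deg_b or min(deg_a, deg_b) == 0:
            continue
        larger, smaller = (a, b) if deg_a > deg_b else (b, a)
        shared = sum(1 for x, y in zip(larger, smaller) if x and y)
        total += shared / min(deg_a, deg_b)
    return total, n_pairs


def nodf(matrix):
    """NODF of a binary matrix, scaled to the interval [0, 1]."""
    rows = [list(r) for r in matrix]
    cols = [list(c) for c in zip(*matrix)]
    row_total, row_pairs = _paired_overlap(rows)
    col_total, col_pairs = _paired_overlap(cols)
    n_pairs = row_pairs + col_pairs
    return (row_total + col_total) / n_pairs if n_pairs else 0.0


perfectly_nested = [[1, 1, 1], [1, 1, 0], [1, 0, 0]]
one_to_one = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(nodf(perfectly_nested), nodf(one_to_one))
```

The two example matrices mark the extremes: in the first, each specialist's partners are a subset of every more generalist species' partners, while in the second no species shares a partner with any other.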

Is nestedness important for predicting persistence in these models? 
Our results, confirmed by the methods of ref. 1, show that it is less 
important than: network size; connectance; degree distribution; 
intrinsic growth rates; competition coefficients; and the strength of 
the mutualistic interactions. If two ecosystems can be found that share 
all these properties then, under the specific dynamic model tested 
here, the more nested ecosystem may (depending how nestedness is 
defined) be more likely to have a higher persistence. However, if any of 
these properties differ between the two ecosystems, then any effect of 
nestedness is likely to be unimportant. 

In conclusion, nestedness is an interesting abstract network prop- 
erty that undoubtedly influences the statistical behaviour of large 
systems of differential equations. However, general conclusions 
allowing nestedness to be used as a predictor of empirical biodiversity 
cannot currently be justified. 